Agentic Horizons
Dan Vanderboom
106 episodes
6 days ago
Agentic Horizons is an AI-hosted podcast exploring the cutting edge of artificial intelligence. Each episode dives into topics like generative AI, agentic systems, and prompt engineering, with content generated by AI agents based on research papers and articles from top AI experts. Whether you're an AI enthusiast, developer, or industry professional, this show offers fresh, AI-driven insights into the technologies shaping the future.
Technology
All content for Agentic Horizons is the property of Dan Vanderboom and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Do LLMs Estimate Uncertainty Well?
Agentic Horizons
6 minutes 50 seconds
9 months ago

This episode explores the challenges of uncertainty estimation in large language models (LLMs) for instruction-following tasks. While LLMs show promise as personal AI agents, they often struggle to accurately assess their uncertainty, leading to deviations from guidelines. The episode highlights the limitations of existing uncertainty methods, like semantic entropy, which focus on fact-based tasks rather than instruction adherence.

Key findings from the evaluation of six uncertainty estimation methods across four LLMs reveal that current approaches struggle with subtle instruction-following errors. The episode introduces a new benchmark dataset with Controlled and Realistic versions to address the limitations of existing datasets, ensuring a more accurate evaluation of uncertainty.


The discussion also covers the performance of various methods, with self-evaluation excelling in simpler tasks and logit-based approaches showing promise in more complex ones. Smaller models sometimes outperform larger ones in self-evaluation, and internal probing of model states proves effective. The episode concludes by emphasizing the need for further research to improve uncertainty estimation and ensure trustworthy AI agents.
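
As a rough illustration of what a logit-based uncertainty score can look like, the sketch below turns per-token log-probabilities into a sequence-level score and then checks how well that score separates failed responses from successful ones. It is not the method evaluated in the paper: the function names are hypothetical, the log-probabilities and labels are made up, and the use of AUROC as the quality measure is an assumption made here because it is a common choice for this kind of comparison.

```python
import math
from typing import Sequence


def mean_nll(token_logprobs: Sequence[float]) -> float:
    """Average negative log-likelihood per token (higher = less confident)."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return -sum(token_logprobs) / len(token_logprobs)


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Exponentiated mean NLL, another common logit-based score."""
    return math.exp(mean_nll(token_logprobs))


def auroc(scores: Sequence[float], failed: Sequence[bool]) -> float:
    """Probability that a failed response gets a higher uncertainty score
    than a successful one, counting ties as half."""
    pos = [s for s, f in zip(scores, failed) if f]
    neg = [s for s, f in zip(scores, failed) if not f]
    if not pos or not neg:
        raise ValueError("need at least one failed and one successful response")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up log-probabilities for three responses (illustrative only).
responses = [
    [math.log(0.9), math.log(0.8)],  # confident response
    [math.log(0.5), math.log(0.5)],  # less confident response
    [math.log(0.2), math.log(0.3)],  # least confident response
]
# Hypothetical labels: True means the response broke the instruction.
failed = [False, False, True]

scores = [mean_nll(r) for r in responses]
print(auroc(scores, failed))
```

An AUROC near 1.0 means the score ranks failed responses above successful ones, while a value near 0.5 means it is no better than chance at flagging instruction-following errors.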


https://arxiv.org/pdf/2410.14582
