Tool Use - AI Conversations
Mike Bird
64 episodes
1 week ago
The latest AI tools and strategies, featuring top builders, entrepreneurs, and researchers. Every Tuesday, get actionable knowledge about AI services that you can use today.

- Conversations with AI builders and founders
- Practical tips for applying AI immediately
- Live demos of cutting-edge AI products
- Deep dives into open-source projects

Join us as we explore the rapidly evolving AI landscape and discover tools that transform productivity, creativity, and business. Subscribe now and stay ahead of the AI curve.
Technology
All content for Tool Use - AI Conversations is the property of Mike Bird and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
The Right Way to Do AI Evals (ft Freddie Vargus)
Tool Use - AI Conversations
55 minutes 15 seconds
4 months ago

Are your AI agents unreliable? In this guide, we reveal a professional system for AI evals to help you build and ship better AI products, faster. Learn how to systematically test LLM performance, evaluate complex tool use, and improve multi-turn conversations. We break down how to build a high-quality eval dataset, how to use milestones and minefields to track agent behaviour, and how to use an LLM as a judge without compromising quality. Stop guessing and start making real, measurable improvements to your AI today.
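
To make the "milestones and minefields" idea concrete, here is a minimal Python sketch. It is not taken from the episode or from any Quotient AI product. It assumes that a milestone is a tool call the agent is expected to make (in order) and a minefield is a tool call it must never make. The names `TrajectoryCheck` and `score_trajectory`, and the tool names in the example, are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TrajectoryCheck:
    """Hypothetical eval case: tool calls that must and must not occur."""
    milestones: list[str]  # tool names expected to appear, in this order
    minefields: set[str] = field(default_factory=set)  # tool names that must never appear


def score_trajectory(tool_calls: list[str], check: TrajectoryCheck) -> dict:
    """Score one agent run, given the ordered list of tool names it called."""
    # Walk the trajectory once, advancing through the milestones in order.
    reached = 0
    for name in tool_calls:
        if reached < len(check.milestones) and name == check.milestones[reached]:
            reached += 1

    # Any forbidden tool that appears anywhere in the run counts as tripped.
    tripped = sorted(set(tool_calls) & check.minefields)
    total = len(check.milestones)
    return {
        "milestones_hit": reached,
        "milestone_rate": reached / total if total else 1.0,
        "minefields_tripped": tripped,
        "passed": reached == total and not tripped,
    }


# Assumed example: a refund flow where deleting the account is never acceptable.
check = TrajectoryCheck(
    milestones=["search_orders", "issue_refund"],
    minefields={"delete_account"},
)
result = score_trajectory(["search_orders", "lookup_policy", "issue_refund"], check)
print(result)
```

A check like this gives a per-run pass/fail plus a partial-credit rate, which can be averaged over an eval dataset to compare agent versions.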


Check out Quotient AI

https://www.quotientai.co/


Sign up for A.I. coaching for professionals at: https://www.anetic.co


Get FREE AI tools

pip install tool-use-ai


Connect with us https://x.com/ToolUseAI

https://x.com/MikeBirdTech

https://x.com/freddie_v4


00:00:00 - Intro

00:02:54 - Why You Need AI Evals

00:06:13 - How to Evaluate AI Agent Tool Use

00:29:24 - The Process for Building Your First Eval Dataset

00:42:44 - Using an LLM as a Judge The Right Way


Subscribe for more insights on AI tools, productivity, and AI evals.

Tool Use is a weekly conversation with AI experts brought to you by Anetic.
