© 2024 PodJoint
Super Prompt: Generative AI
Tony Wan
29 episodes
2 days ago
Description: AI agents from OpenAI, Google, and Anthropic promise to act on your behalf—booking flights, handling tasks, making decisions. What kind of agency do these systems actually have? And whose interests are they serving? Enterprise AI agents are already deployed in customer support, code generation, and task automation. Consumer agents—ChatGPT Agent Mode, personal task assistants—face a wider gap between marketing promises and actual capabilities. The alignment problem: agents need ac...
Technology
All content for Super Prompt: Generative AI is the property of Tony Wan and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
LLM Benchmarks: How to Know Which AI Is Better
Super Prompt: Generative AI
10 minutes
1 year ago
Beyond ChatGPT and Gemini: Anthropic's Claude and the $4 billion Amazon investment. How AI industry benchmarks work, including LMSYS Arena Elo and MMLU (Measuring Massive Multitask Language Understanding). How benchmarks are constructed, what they measure, and how to use them to evaluate LLMs. Solo episode. Anthropic's Claude https://claude.ai [Note: I am not sponsored by Anthropic] LMSYS Leaderboard https://chat.lmsys.org/?leaderboard To stay in touch, sign up for our newsletter at ht...