The AI Research Deep Dive
36 episodes
6 days ago
From arXiv to insight: a daily tour of cutting-edge AI papers. The AI Research Deep Dive podcast dives into a new groundbreaking research paper every day. It combs through the most important details and results to give you a great idea of what the paper accomplishes and how it gets there.
Science
Defeating Nondeterminism in LLM Inference
15 minutes 26 seconds
1 month ago

Link: https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/

This episode of "The AI Research Deep Dive" explores a blog post from Thinking Machines Lab that solves a frustrating mystery: why large language models give different answers to the same prompt even with deterministic settings. The host explains how the authors debunked the common theory of random floating-point errors, instead identifying the true culprit as a lack of "batch invariance" in modern inference libraries. Listeners will learn how the way a user's request is batched with others randomly changes the underlying GPU calculations, leading to different results. The episode covers the team's solution—custom-engineered GPU kernels that enforce consistency—and discusses the profound implications for achieving perfect reproducibility and enabling more stable, "truly on-policy" reinforcement learning.
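The batching effect described above ultimately rests on a simpler fact: floating-point addition is not associative, so combining the same partial results in a different order can change the final bits. The following is a minimal CPU-only sketch of that underlying mechanism (it is illustrative and not the Thinking Machines kernels or any code from the episode); two reduction orders over identical data stand in for the different batch splits an inference server might use.

```python
import numpy as np

# Illustrative only: the episode discusses batch invariance in GPU kernels,
# but the root cause -- float addition is not associative, so reduction
# order matters -- can be demonstrated on any machine.
rng = np.random.default_rng(0)
x = rng.standard_normal(10_000).astype(np.float32)

# Order 1: naive left-to-right accumulation.
s_seq = np.float32(0.0)
for v in x:
    s_seq += v

# Order 2: block-wise reduction (loosely analogous to how a different
# batch split changes the order in which partial sums are combined).
s_blocked = x.reshape(100, 100).sum(axis=1).sum()

# Mathematically equal, but often not bit-identical.
print(float(s_seq), float(s_blocked))
```

Batch-invariant kernels, as the episode describes them, fix the reduction order regardless of how requests are grouped, so results no longer depend on what else happens to be in the batch.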
