In the grand finale of "All Things LLM," hosts Alex and Ben look ahead to the bleeding edge—and reflect on the ultimate question for AI: can we ever truly understand how these models think? Inside this episode: The rise of reasoning models: Discover why the next leap for AI isn’t just bigger models, but smarter thinking. Explore how OpenAI’s o1 and DeepSeek-R1 represent a paradigm shift, moving from brute-force “pre-train and scale” to dynamic, inference-time reasoning. Learn how these new mo...
All content for All Things LLM is the property of Mr. Dew and is served directly from their servers
with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
The Human Touch - How RLHF Aligns Models with Our Values
All Things LLM
5 minutes
1 month ago
How do we make AI not just smart, but safe and genuinely helpful? In this episode of "All Things LLM," Alex and Ben break down the vital process of alignment: transforming a powerful language model into a trustworthy assistant you can rely on. Inside this episode: What is RLHF? Discover Reinforcement Learning from Human Feedback, the multi-stage process that transforms next-word predictors into helpful, instruction-following bots like ChatGPT or Claude. Step-by-Step Alignment: Supervised Fine-Tun...