Embodied AI 101
Shaoqing Tan
24 episodes
2 months ago
Stay in the loop on research in AI and physical intelligence.
Technology
Tech News
RSS
All content for Embodied AI 101 is the property of Shaoqing Tan and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (20/24)
Episode 24: DINOv3 and the Next Generation of Visual Foundation Models
Hello and welcome to Embodied AI 101. Today, we dive into a critical review of DINOv3, a 2025 vision model from Meta AI that marks a major step toward general-purpose visual foundation models. If you’re an AI professional or researcher with an eye on vision-language models (and maybe a toe in robotics), this episode is for you. We’ll explore how DINOv3 fits into the broader movement toward massive-scale, task-agnostic representation learners – models trained not for one narrow task, but to be…
2 months ago
46 minutes 30 seconds

Episode 23: A Critical Look at Hume VLA
Introduction – Two Minds Inside One Robot. Modern embodied AI is drawing inspiration from human cognition. Psychologist Daniel Kahneman famously described a System 1 (fast, intuitive thinking) and System 2 (slow, deliberative reasoning). In robotics, researchers are exploring whether a dual-system approach can give robots the reflexes of an athlete and the deliberation of a chess master. The 2025 paper “Hume: Introducing System-2 Thinking in Visual-Language-Action Model” steps boldly into…
2 months ago
50 minutes 24 seconds

Episode 22: Critical Review of π0.5
Introduction. Roboticists have long dreamed of generalist robots that can step out of the lab and perform useful tasks in unstructured, everyday settings. The challenge is generalization – can a robot handle a task in a brand-new environment with new objects, not just the scenarios it was trained on? The π0.5 model (pronounced “pi zero-point-five”) is a Vision-Language-Action (VLA) model proposed in April 2025 as a step toward this goal. It builds on an earlier π0 model and aims to endow robots…
2 months ago
1 hour 10 minutes 16 seconds

Episode 21: Deep Dive: ReinboT and the Fusion of RL with Vision-Language-Action
Introduction: A New Twist in Robot Learning. Hello and welcome to Robotics Unwrapped, where we explore cutting-edge advances in robot learning. Today, we’re diving into ReinboT, a model fresh out of ICML 2025 that promises to amplify robot manipulation by weaving reinforcement learning (RL) ideas into vision-language-action (VLA) models. Imagine a robot that not only interprets what it sees and the instructions it’s given, but also has a sense of how rewarding its actions will be. ReinboT…
3 months ago
1 hour 1 minute 24 seconds

Episode 20: RIPT-VLA - The Fine-Tuning Revolution
Section 1: The Distributional Shift Problem. At the heart of modern robotics lies a fundamental challenge. We train our most advanced models, known as Vision-Language-Action models or VLAs, using a technique called Behavioral Cloning. In essence, the robot watches millions of perfect, expert demonstrations and learns to imitate them. This is a powerful starting point, but it suffers from a critical flaw known as "covariate shift" or "distributional shift." Imagine learning to drive a car…
3 months ago
6 minutes 36 seconds

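The compounding-error failure mode this episode describes can be sketched numerically. The following is an illustrative toy model, not code from the episode: each small per-step deviation pushes a behavior-cloned policy into states slightly outside its training data, so drift accumulates quadratically with the horizon rather than staying bounded.

```python
def compounding_drift(per_step_error: float, horizon: int) -> float:
    """Toy model of distributional shift in behavioral cloning: every step,
    the policy ends up a little further off the expert's state distribution,
    and all the accumulated offsets add up over the trajectory.
    The quadratic growth mirrors the classic O(T^2) imitation-learning bound."""
    drift = 0.0
    total = 0.0
    for _ in range(horizon):
        drift += per_step_error  # each step starts a bit more off-distribution
        total += drift           # accumulated deviation from the expert trajectory
    return total

# Doubling the horizon roughly quadruples the accumulated drift:
short_run = compounding_drift(0.01, 100)  # ≈ 50.5
long_run = compounding_drift(0.01, 200)   # ≈ 201.0
```

This is why the interactive fine-tuning methods discussed in the episode matter: letting the policy visit (and recover from) its own off-distribution states breaks the quadratic accumulation.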
Episode 19: The Key to Adaptable Robots: Reinforcement Learning
Imagine a robot helper in your home. You ask it, "Hey, could you put the milk in the fridge?" Simple enough. But what if you bought a different brand of milk today, in a carton that’s shaped a little differently? What if the fridge door is slightly ajar, or the lighting in the kitchen is a bit dimmer than usual? For most of today's advanced robots, these tiny, everyday variations can cause a total system failure. They are often trained by simply watching and copying human actions, a method…
3 months ago
10 minutes 59 seconds

Episode 18: A Technical Blueprint for Your Own Sim2Real Project
In our last episode, we saw how foundation models provide robots with a "common sense" understanding of the world. But a passive understanding is not enough. For a robot to be truly general-purpose, it must act, adapt, and learn within the physical world. Today, we move beyond the conceptual to the technical blueprint. What are the core engineering and algorithmic components required to build a truly generic robot? The answer lies in the synthesis of three key technologies: gradient-free optimization…
3 months ago
7 minutes 5 seconds

Episode 17: The Role of Foundation Models in Sim2Real
Welcome to the final episode of our series on the Sim2Real challenge. We've been on a long journey, from dissecting the "reality gap" to exploring techniques for building robust and adaptable robots. We've learned how to embrace chaos with Domain Randomization, how to train against adversaries, and how to create policies that can learn and adapt on the fly. But today, we're looking at a new frontier, a paradigm shift in AI that is poised to revolutionize robotics: Foundation Models. This is…
3 months ago
4 minutes 19 seconds

Episode 16: Transfer Learning and Meta-Learning in Sim2Real
Welcome back. In our previous episodes, we've explored powerful techniques like Domain Randomization and Adversarial Learning. These methods are all about building a single, hyper-robust policy that can withstand the harsh realities of the physical world. But what if there's a different approach? What if, instead of building one policy to rule them all, we could create policies that are masters of adaptation? This is the focus of today's episode: Transfer Learning and Meta-Learning. We're…
3 months ago
4 minutes 31 seconds

Episode 15: Adversarial Approaches to Sim2Real
Welcome back to our series on the Sim2Real challenge. In our last episode, we explored Domain Randomization, a technique where we embrace chaos in our simulations to build robust robots. We learned that by training on a wide variety of simulated conditions, we can create policies that are less sensitive to the "reality gap." But what if, instead of preparing for a broad range of possibilities, we could prepare for the worst-case scenario? What if we could find the "chinks in the armor" of…
3 months ago
4 minutes 5 seconds

Episode 14: Domain Randomization: A Key Technique for Sim2Real Transfer
Welcome back to the podcast. In our last episode, we dissected the "reality gap"—the chasm between the clean, predictable world of simulation and the messy, chaotic real world. We learned that this gap is a major roadblock in robotics, preventing policies trained in simulation from working effectively on physical robots. Today, we're exploring one of the most powerful and counter-intuitive techniques for bridging this gap: Domain Randomization. The core idea is simple: if you want your robot…
3 months ago
5 minutes 10 seconds

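The core idea of domain randomization can be sketched in a few lines. This is a minimal, hypothetical illustration: the parameter names and ranges below are invented for the example, not taken from the episode.

```python
import random

def sample_sim_params(rng: random.Random) -> dict:
    """Domain randomization in miniature: draw a fresh set of physics and
    rendering parameters for every training episode, so the learned policy
    cannot overfit to any single simulated world and must become robust to
    the whole range of variations (which hopefully covers reality)."""
    return {
        "friction": rng.uniform(0.5, 1.5),        # surface friction coefficient
        "object_mass_kg": rng.uniform(0.1, 2.0),  # mass of the manipulated object
        "light_intensity": rng.uniform(0.2, 1.0), # scene lighting level
        "camera_noise_std": rng.uniform(0.0, 0.05),  # sensor noise
    }

# Each training episode runs in a differently-perturbed simulator:
rng = random.Random(0)
episode_worlds = [sample_sim_params(rng) for _ in range(1000)]
assert all(0.5 <= w["friction"] <= 1.5 for w in episode_worlds)
```

In a real pipeline these parameters would be fed to a physics simulator before each rollout; the sketch only shows the sampling step that makes the training distribution wide.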
Episode 13: The Sim2Real Challenge: Why Virtual Robots Struggle in the Real World
Welcome to the podcast, where we explore the cutting edge of AI and robotics. Today, we're diving into one of the most fundamental challenges in robotics: the Sim2Real problem. Imagine you're a pilot. You've spent hundreds of hours in a state-of-the-art flight simulator. You can handle any emergency, any weather condition the simulator throws at you. You're a top ace... in the virtual world. But the first time you step into a real cockpit, you realize things are different. The controls feel…
3 months ago
6 minutes

Sim2Real Challenge
The transfer of policies from simulation to physical hardware, a process known as Sim2Real, represents one of the most significant and persistent challenges in modern robotics. While simulation offers a safe, scalable, and parallelizable environment for reinforcement learning, the utility of this approach is fundamentally limited by the "reality gap"—the discrepancy between the dynamics of the simulated world and those of the real world. A policy optimized in a flawed simulation will inherit…
3 months ago
49 minutes 40 seconds

Episode 5: Beyond OpenVLA – The Evolving Landscape of Vision-Language-Action Systems
This is it – our final episode in the series. So far, we’ve focused on OpenVLA itself. Now we’re zooming out to the bigger picture of Vision-Language-Action models and where things are headed. Think of this as a round-table tour of the “who’s who” and “what’s next” in VLA. We’ll discuss some companion models, improvements, and future themes: from Google’s robotic transformers to new open models like Octo, to special techniques like RT-Trajectory, and even the latest forays into humanoid robotics…
3 months ago
17 minutes 25 seconds

Episode 4: From Simulation to Reality – Embodiment and Real-World Deployment
Welcome back! So far we’ve tackled the concept of OpenVLA and the training of OpenVLA. Now it’s time for the real fun: robots! In this episode, we’ll discuss how OpenVLA connects to actual robot hardware and different embodiments. How does one model control many types of robot arms? What happens when you take it off the computer and put it on a real robot in a real environment? We’ll cover the generalization across embodiments, the process of deploying in the real world, and some impressive…
3 months ago
13 minutes 7 seconds

Episode 3: Training a Robot’s Brain – OpenVLA’s Learning and Adaptation
Welcome back to our OpenVLA deep dive. In the last episode we figured out what the model’s parts are and how they operate. Now it’s time for the next logical question: How do you teach such a model to do all those tasks? Today we’re going to talk about OpenVLA’s training process—both its initial pretraining on a huge dataset and the ways we can fine-tune it afterward. If Episode 2 was the “hardware” (so to speak) of the brain, consider this the “education and practice” that made it smart.
3 months ago
11 minutes 42 seconds

Episode 2: Under the Hood of OpenVLA – Architecture and Inference
Welcome back! Last time we talked about what OpenVLA is at a high level. Now it’s time to lift the hood and see how this engine runs. How can one AI model look at a camera image, read a command, and then generate robot arm motions to fulfill it? In this episode, we’ll break down OpenVLA’s architecture and discuss how it processes inputs and produces actions. If you’re into AI model design (or just curious how the sausage is made), this one’s for you. So, let’s start with the big picture…
3 months ago
10 minutes 39 seconds

Episode 1: From Vision and Language to Action – An Introduction to VLAs and OpenVLA
Hello and welcome! In this first episode, we’re laying the groundwork for our journey into Vision-Language-Action systems. Today we’ll answer: What is a Vision-Language-Action model, and why is OpenVLA making waves in robotics? So grab your headphones and let’s dive into the world where seeing, speaking, and doing all come together. Imagine telling a robot, “Pick up the red block and put it on the table,” and it just does it—no hard coding, no task-specific training required. That’s the promise…
3 months ago
6 minutes 15 seconds

Episode 6: The Road Ahead – GR00T N1.5 and the Future of Humanoid AI
Hello and welcome to the final episode of our deep dive on NVIDIA’s GR00T N1. It’s been a fascinating journey so far, and now it’s time to look forward. What comes after GR00T N1? How is this model evolving, and what does it mean for the future of AI-powered humanoid robots? In this episode, we’ll talk about the immediate next step – the GR00T N1.5 update – and then zoom out to the broader implications for the industry and what might lie ahead in the world of generalist robot intelligence.
3 months ago
10 minutes 14 seconds

Episode 5: Real Robots, Real Results – GR00T N1 in Action
Welcome to Episode 5! Now that we know what GR00T N1 is capable of in theory and controlled tests, let’s explore how it’s being used in practice. This episode is all about real-world deployments and industry adoption. We’ll discuss how companies and research teams are integrating GR00T N1 into actual robots, and what early results they’re seeing. It’s one thing to have a cool demo in a lab, but it’s another to bring that tech into the real world where things are messy, unpredictable, and where…
3 months ago
11 minutes 53 seconds