
1. RLAIF at Scale: Reinforcement Learning from AI Feedback for Multi-Turn Reasoning
This paper explores using AI-generated feedback instead of expensive human labels to train reasoning models. The authors show that Reinforcement Learning from AI Feedback (RLAIF) can match, and in some cases outperform, training on limited human feedback, especially on multi-turn reasoning tasks.
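For intuition, here is a minimal Python sketch of the RLAIF idea: candidate responses are scored by an AI judge instead of a human annotator, and the scores drive a deliberately crude policy update. The functions `generate`, `ai_judge`, and `update_policy` are hypothetical stand-ins, not the paper's implementation.

```python
# Toy RLAIF loop: AI feedback replaces human labels as the reward signal.
import random

def generate(prompt: str, policy: dict, n: int = 4) -> list[str]:
    """Sample n candidate responses; here, canned strings weighted by policy."""
    candidates = ["step-by-step answer", "short answer", "off-topic reply"]
    weights = [policy.get(c, 1.0) for c in candidates]
    return random.choices(candidates, weights=weights, k=n)

def ai_judge(prompt: str, response: str) -> float:
    """Stand-in for an AI preference model: rewards reasoning-like replies."""
    return 1.0 if "step-by-step" in response else 0.1

def update_policy(policy: dict, response: str, reward: float, lr: float = 0.5):
    """Crude reward-weighted update in place of a real RL step (e.g. PPO)."""
    policy[response] = policy.get(response, 1.0) + lr * reward

policy: dict[str, float] = {}
for step in range(20):
    prompt = "Explain why the sky is blue."
    for response in generate(prompt, policy):
        reward = ai_judge(prompt, response)  # AI feedback, no human label
        update_policy(policy, response, reward)

print(max(policy, key=policy.get))  # the policy now favors the judged-best style
```

In a real system the judge would be a strong LLM scoring full multi-turn transcripts, and the update would be a proper RL algorithm; the loop structure is the part that carries over.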
2. Learning to Forget: Dynamic Memory Compression in Long-Context Transformers
The authors propose a method for making transformers more efficient on long contexts by teaching them to “forget” unimportant details. Their dynamic memory compression reduces memory usage by over 40% while maintaining — and sometimes improving — accuracy on long-sequence benchmarks.
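As a rough illustration of the "learning to forget" idea, the sketch below keeps a fixed-budget cache and evicts the entries with the lowest importance scores. The `importance` values stand in for scores a real model would learn to predict; the paper's actual compression mechanism is not shown here.

```python
# Toy "learned forgetting" for a memory cache: keep only what scores as important.
from dataclasses import dataclass, field

@dataclass
class CompressedCache:
    budget: int                                   # max entries kept in memory
    entries: list[tuple[float, int, str]] = field(default_factory=list)

    def add(self, token_id: int, state: str, importance: float) -> None:
        """Insert an entry, then evict the least important ones over budget."""
        self.entries.append((importance, token_id, state))
        if len(self.entries) > self.budget:
            self.entries.sort(reverse=True)           # most important first
            self.entries = self.entries[: self.budget]  # "forget" the rest

cache = CompressedCache(budget=3)
for i, (state, score) in enumerate(
    [("the", 0.1), ("theorem", 0.9), ("of", 0.05), ("Pythagoras", 0.95)]
):
    cache.add(i, state, score)

print([state for _, _, state in cache.entries])
# content-bearing entries survive; low-importance filler is dropped
```

The memory saving comes from the fixed budget: past a point, cache size stays constant no matter how long the context grows.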
3. VidAgent: Scalable Video Agents with Spatio-Temporal Reasoning
This work introduces VidAgent, a system that can understand and reason over long videos by grounding events in both space and time. It achieves state-of-the-art performance on video QA benchmarks and points toward applications in video search and monitoring.
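To make "grounding events in space and time" concrete, here is a hypothetical sketch: detected events carry a time span and a bounding box, and a question is answered by filtering over them. The `Event` structure and `answer_when` helper are illustrative assumptions, not VidAgent's actual API.

```python
# Hypothetical spatio-temporal grounding for video QA: events are anchored
# to a time span (when) and a bounding box (where).
from dataclasses import dataclass

@dataclass
class Event:
    label: str                        # what happened
    t_start: float                    # seconds into the video
    t_end: float
    box: tuple[int, int, int, int]    # (x, y, w, h) in frame pixels

def answer_when(events: list[Event], label: str) -> str:
    """Ground a 'when did X happen?' query in time and space."""
    hits = [e for e in events if e.label == label]
    if not hits:
        return f"'{label}' was not detected."
    e = min(hits, key=lambda e: e.t_start)
    return f"'{label}' first occurs at {e.t_start:.0f}-{e.t_end:.0f}s, box {e.box}."

events = [
    Event("dog enters frame", 12.0, 15.5, (40, 80, 120, 90)),
    Event("ball is thrown", 18.2, 19.0, (200, 30, 25, 25)),
]
print(answer_when(events, "ball is thrown"))
```

Answering over an event index like this, rather than raw frames, is what lets such a system scale to long videos.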