Tech made Easy
Tech Guru
27 episodes
6 days ago
"Welcome to Tech Made Easy, the podcast where we dive deep into cutting-edge technical research papers, breaking down complex ideas into insightful discussions. Each episode, two tech enthusiasts explore a different research paper, simplifying the jargon, debating key points, and sharing their thoughts on its impact on the field. Whether you're a professional or a curious learner, join us for a geeky yet accessible journey through the world of technical research."
Technology
All content for Tech made Easy is the property of Tech Guru and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
DeepSeek-R1: Reasoning via Reinforcement Learning
Tech made Easy
18 minutes 36 seconds
9 months ago

DeepSeek-AI introduces DeepSeek-R1, a reasoning model developed through reinforcement learning (RL) and distillation. The research explores two models: DeepSeek-R1-Zero, trained purely via RL, and DeepSeek-R1, which adds multi-stage training and "cold-start" data before RL to improve reasoning and readability. The paper highlights the reasoning behaviors that emerge in DeepSeek-R1-Zero and reports that DeepSeek-R1 performs comparably to OpenAI's o1-1217 on reasoning tasks. Distilling DeepSeek-R1 yields smaller, more efficient models, demonstrating that reasoning patterns can be effectively transferred. The research also details challenges and unsuccessful attempts during development, such as using Process Reward Models and Monte Carlo Tree Search. The models and distilled versions are open-sourced to support further research in the community.

