Agora - The Marketplace of Ideas
Matthew Harris
98 episodes
5 days ago
Welcome to Agora, the Marketplace of Ideas. I'd say the sky's the limit, but how can that be true when there are footprints on the moon? This is your home for bleeding-edge tech and macro perspectives with just a bit of philosophy. Contributor: https://s3.news/
Technology
All content for Agora - The Marketplace of Ideas is the property of Matthew Harris and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Reinforcement Learning for LLM Reasoning: The State of the Art
Agora - The Marketplace of Ideas
22 minutes 2 seconds
6 months ago

**This episode provides a comprehensive overview of using reinforcement learning (RL) to enhance the reasoning abilities of large language models (LLMs).** It contrasts conventional LLMs with newer reasoning models and highlights the potential of RL for strategic computation. The source article explains key RL concepts such as reinforcement learning from human feedback (RLHF) and proximal policy optimization (PPO), then introduces more recent advances such as group relative policy optimization (GRPO) and reinforcement learning with verifiable rewards (RLVR), exemplified by DeepSeek-R1's training. Finally, it summarizes lessons from recent research papers, covering how to improve distilled models, biases in RL algorithms, the emergence of reasoning capabilities, generalization across domains, and the ongoing debate about what primarily drives LLM reasoning.
