A writer and a software engineer from Google's People + AI Research team explore the human choices that shape machine learning systems by building competing tic-tac-toe agents.
Give that model a treat! : Reinforcement learning explained
Tic-Tac-Toe the Hard Way
26 minutes
5 years ago
Switching gears, we focus on how Yannick’s been training his model using reinforcement learning. He explains the differences from David’s supervised learning approach. We find out how his system performs against a player that makes random tic-tac-toe moves.