The Cartesian Cafe is the podcast where an expert guest and Timothy Nguyen map out scientific and mathematical subjects in detail. This collaborative journey with other experts will have us writing down formulas, drawing pictures, and reasoning about them together on a whiteboard. If you’ve been longing for a deeper dive into the intricacies of scientific subjects, then this is the podcast for you. Topics covered include mathematics, physics, machine learning, artificial intelligence, and computer science.
Content also viewable on YouTube: www.youtube.com/timothynguyen and Spotify.
Timothy Nguyen is a mathematician and AI researcher working in industry.
Homepage: www.timothynguyen.com, Twitter: @IAmTimNguyen
Patreon: www.patreon.com/timothynguyen
Greg Yang | Large N Limits: Random Matrices & Neural Networks
Duration: 3 hours, 1 minute, 27 seconds
Greg Yang is a mathematician and AI researcher at Microsoft Research who for the past several years has done incredibly original theoretical work on the understanding of large artificial neural networks. Greg received his bachelor's in mathematics from Harvard University in 2018, and while there won the Hoopes Prize for best undergraduate thesis. He also received an Honorable Mention for the Morgan Prize for Outstanding Research in Mathematics by an Undergraduate Student in 2018 and was an invited speaker at the International Congress of Chinese Mathematicians in 2019.
In this episode, we get a sample of Greg's work, which goes under the name "Tensor Programs" and currently spans five highly technical papers. The route chosen to compress Tensor Programs into the scope of a conversational video is to place its main concepts under the umbrella of one larger, central, and time-tested idea: that of taking a large N limit. This idea occurs most famously in the Law of Large Numbers and the Central Limit Theorem, which in turn play a fundamental role in the branch of mathematics known as Random Matrix Theory (RMT). We review this foundational material and then show how Tensor Programs (TP) generalizes this classical work, offering new proofs of RMT results. We conclude with the applications of Tensor Programs to a (rare!) rigorous theory of neural networks.
Patreon: https://www.patreon.com/timothynguyen
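As a warm-up to the themes in the outline below, here is a minimal numpy sketch (our own illustration, not code from the episode) of the two classical large N limits the conversation starts from: the Law of Large Numbers and the Central Limit Theorem.

```python
# Our own toy illustration, not code from the episode.
import numpy as np

rng = np.random.default_rng(0)

# Law of Large Numbers: the average of N iid samples concentrates at the mean.
N = 100_000
x = rng.uniform(-1.0, 1.0, size=N)   # mean 0, variance 1/3
print("sample mean:", x.mean())      # -> close to 0

# Central Limit Theorem: sqrt(N) * (sample mean) is approximately Gaussian
# with variance 1/3, regardless of the underlying uniform shape.
# Estimate its variance and fourth moment over many independent trials.
trials = rng.uniform(-1.0, 1.0, size=(10_000, 1_000))
z = np.sqrt(1_000) * trials.mean(axis=1)
print("variance:", z.var())              # -> close to 1/3
print("fourth moment:", (z**4).mean())   # -> close to 3*(1/3)**2 = 1/3
```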
Part I. Introduction
00:00:00 : Biography
00:02:45 : Harvard hiatus 1: Becoming a DJ
00:07:40 : I really want to make AGI happen (back in 2012)
00:09:09 : Impressions of Harvard math
00:17:33 : Harvard hiatus 2: Math autodidact
00:22:05 : Friendship with Shing-Tung Yau
00:24:06 : Landing a job at Microsoft Research: Two Fields Medalists are all you need
00:26:13 : Technical intro: The Big Picture
00:28:12 : Whiteboard outline
Part II. Classical Probability Theory
00:37:03 : Law of Large Numbers
00:45:23 : Tensor Programs Preview
00:47:26 : Central Limit Theorem
00:56:55 : Proof of CLT: Moment method
1:00:20 : Moment method explicit computations (see the sketch after Part II)
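A quick numerical companion (our own, not from the episode) to Part II's moment-method segment: the 2k-th moment of a standard Gaussian is (2k-1)!!, the number of pairings of 2k items, and the moments of a normalized sum of iid variables converge to exactly these values.

```python
# Our own illustration of the moment computations behind the CLT proof.
import numpy as np

def double_factorial(n):
    """(2k-1)!! = 1 * 3 * 5 * ... * n, the number of pairings of n+1 items."""
    return int(np.prod(np.arange(n, 0, -2)))

rng = np.random.default_rng(1)
N, trials = 1_000, 20_000
x = rng.choice([-1.0, 1.0], size=(trials, N))   # Rademacher: mean 0, variance 1
s = x.sum(axis=1) / np.sqrt(N)                  # normalized sums

for k in (1, 2, 3, 4):
    empirical = (s ** (2 * k)).mean()
    pairings = double_factorial(2 * k - 1)      # 1, 3, 15, 105
    print(f"E[S^{2*k}] ~ {empirical:7.2f}   vs Gaussian moment (2k-1)!! = {pairings}")
```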
Part III. Random Matrix Theory
1:12:46 : Setup
1:16:55 : Moment method for RMT
1:21:21 : Wigner semicircle law (see the sketch after Part III)
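To see Part III's punchline concretely, here is a short sketch (ours, not the episode's) of the Wigner semicircle law: the eigenvalue histogram of a large random symmetric matrix, scaled so entries have variance 1/N, settles onto the density sqrt(4 - x^2)/(2*pi) on [-2, 2].

```python
# Our own illustration of the Wigner semicircle law.
import numpy as np

rng = np.random.default_rng(2)
N = 2_000
A = rng.standard_normal((N, N))
W = (A + A.T) / np.sqrt(2 * N)        # symmetric, off-diagonal variance 1/N
eigs = np.linalg.eigvalsh(W)

# Compare the empirical spectral density to the semicircle on [-2, 2].
hist, edges = np.histogram(eigs, bins=40, range=(-2, 2), density=True)
mids = (edges[:-1] + edges[1:]) / 2
semicircle = np.sqrt(4 - mids**2) / (2 * np.pi)
print("max |histogram - semicircle|:", np.abs(hist - semicircle).max())
print("largest eigenvalue:", eigs.max())   # -> close to the spectral edge at 2
```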
Part IV. Tensor Programs
1:31:03 : Segue using RMT
1:44:22 : TP punchline for RMT
1:46:22 : The Master Theorem (the key result of TP; see the sketch after Part IV)
1:55:04 : Corollary: Reproof of RMT results
1:56:52 : General definition of a tensor program
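The following toy computation (our own construction; the one-step program and the test function psi are illustrative choices, not notation from the papers) shows the flavor of the Master Theorem: coordinate averages computed by a tensor program converge, as the width n grows, to expectations over a Gaussian whose variance the program determines.

```python
# Our own toy illustration in the spirit of the TP Master Theorem.
import numpy as np

rng = np.random.default_rng(3)
n = 4_096
x = rng.standard_normal(n)                    # input vector, iid N(0, 1) entries
W = rng.standard_normal((n, n)) / np.sqrt(n)  # iid N(0, 1/n) weight matrix
h = np.tanh(W @ x)                            # one matmul + nonlinearity step

psi = lambda t: t ** 2                        # an illustrative test function
print("program coordinate average:", psi(h).mean())

# Predicted limit: E[psi(tanh(Z))] with Z ~ N(0, ||x||^2 / n),
# estimated here by Monte Carlo.
z = rng.standard_normal(1_000_000) * np.sqrt(x @ x / n)
print("Gaussian limit estimate:   ", psi(np.tanh(z)).mean())
```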
Part V. Neural Networks and Machine Learning
2:09:05 : Feed forward neural network (3 layers) example
2:19:16 : Neural network Gaussian Process (see the sketch after Part V)
2:23:59 : Many distinct large N limits for neural networks
2:27:24 : abc parametrizations (Note: "a" is absorbed into "c" here): variance and learning rate scalings
2:36:54 : Geometry of space of abc parametrizations
2:39:41 : Kernel regime
2:41:32 : Neural tangent kernel
2:43:35 : (No) feature learning
2:48:42 : Maximal feature learning
2:52:33 : Current problems with deep learning
2:55:02 : Hyperparameter transfer (muP)
3:00:31 : Wrap up
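Finally, a rough sketch (ours, mirroring the 3-layer feedforward example in Part V under one common 1/sqrt(fan-in) initialization; not the episode's code) of the Neural Network Gaussian Process limit: across random initializations, a wide network's output at a fixed input looks increasingly Gaussian as width grows.

```python
# Our own illustration of the NNGP limit across random initializations.
import numpy as np

rng = np.random.default_rng(4)

def network_output(x, width):
    """One random initialization of a 3-layer tanh network with
    1/sqrt(fan_in) weight scaling; returns the scalar output."""
    d = x.shape[0]
    W1 = rng.standard_normal((width, d)) / np.sqrt(d)
    W2 = rng.standard_normal((width, width)) / np.sqrt(width)
    w3 = rng.standard_normal(width) / np.sqrt(width)
    return w3 @ np.tanh(W2 @ np.tanh(W1 @ x))

x = rng.standard_normal(16)                   # a fixed input
for width in (8, 64, 256):
    outs = np.array([network_output(x, width) for _ in range(2_000)])
    # Excess kurtosis of a Gaussian is 0; it should shrink as width grows.
    excess_kurtosis = (outs**4).mean() / outs.var() ** 2 - 3.0
    print(f"width {width:3d}: variance {outs.var():.3f}, "
          f"excess kurtosis {excess_kurtosis:+.3f}")
```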
Further Reading:
Tensor Programs I, II, III, IV, V by Greg Yang and coauthors.
Twitter: @iamtimnguyen
Webpage: http://www.timothynguyen.org