AI talks AI
av3rn
50 episodes
4 days ago
Ever wondered what it's like when AI discusses AI? Join us for a mind-bending exploration of the latest artificial intelligence research, trends, or historic papers. Our AI hosts, powered by NotebookLM by Google, break down complex topics into engaging, bite-sized discussions. Get ready for a unique AI-on-AI conversation!
Technology
EP31: Attention is not all you need - Pure attention loses rank doubly exponentially with depth by Yihe Dong, Jean-Baptiste Cordonnier and Andreas Loukas
AI talks AI
15 minutes 45 seconds
12 months ago

Disclaimer: This podcast is completely AI-generated by NotebookLM 🤖

Summary

In this episode we discuss a paper that investigates the effectiveness of self-attention networks (SANs) in deep learning models. The authors prove that pure SANs, without skip connections or multi-layer perceptrons (MLPs), rapidly lose expressiveness: their output converges doubly exponentially to a rank-1 matrix as network depth increases. In other words, all token representations become identical, and the model loses its ability to capture complex relationships in the data. The authors find that skip connections effectively counteract this rank collapse, while MLPs slow down the convergence. They propose a novel path decomposition method to analyse the behaviour of SANs, revealing that they effectively function as ensembles of shallow networks. This research highlights the critical role of skip connections and MLPs in mitigating the limitations of pure self-attention, providing valuable insights for building more robust and effective deep learning models.
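
The following is a minimal NumPy sketch of the effect described above, not taken from the episode or the paper; the shapes, weight scales, depth, and the relative-residual metric are illustrative assumptions. It stacks random single-head softmax attention layers and compares how far the token matrix stays from "all rows identical" (rank-1 collapse) with and without a skip connection.

import numpy as np

rng = np.random.default_rng(0)
n_tokens, d_model, depth = 16, 32, 12

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention_layer(X, Wq, Wk, Wv):
    # One head of softmax self-attention, with no skip connection and no MLP.
    scores = (X @ Wq) @ (X @ Wk).T / np.sqrt(d_model)
    return softmax(scores, axis=-1) @ (X @ Wv)

def relative_residual(X):
    # Distance of X from the nearest matrix with identical rows (the rank-1
    # collapse discussed above), normalised by the size of X itself.
    return np.linalg.norm(X - X.mean(axis=0, keepdims=True)) / np.linalg.norm(X)

X0 = rng.standard_normal((n_tokens, d_model))
layers = [tuple(0.1 * rng.standard_normal((d_model, d_model)) for _ in range(3))
          for _ in range(depth)]

X_pure, X_skip = X0.copy(), X0.copy()
for i, (Wq, Wk, Wv) in enumerate(layers, start=1):
    X_pure = attention_layer(X_pure, Wq, Wk, Wv)            # pure SAN
    X_skip = X_skip + attention_layer(X_skip, Wq, Wk, Wv)    # SAN plus skip connection
    print(f"depth {i:2d}  pure SAN: {relative_residual(X_pure):.3e}  "
          f"with skip: {relative_residual(X_skip):.3e}")

Running this prints the relative residual after each layer: the pure-attention column shrinks toward zero as depth grows, while the skip-connection column stays close to one, in line with the paper's rank-collapse result.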
