AI Podcast
Kirill Solodskikh
1 episode
4 months ago
Mathematics, Business, Science
All content for AI Podcast is the property of Kirill Solodskikh and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
AI Podcast: Quantization of Neural Networks, Part 1. Introduction, Definitions, Examples.
19 minutes 8 seconds
7 months ago
Quantization is a powerful technique for reducing memory usage and speeding up AI applications built with LLMs, diffusion models, CNNs, and other architectures. In fact, quantization is fundamental to lossy data compression, from JPEG and GIF to MP3 and MP4 (HEVC)! In this episode, we'll cover the basics of neural network quantization, laying the groundwork for future episodes where we'll dive into specific quantization algorithms.

The AI Podcast is hosted by Kirill, CEO of TheStage AI. Drawing on deep scientific and industrial expertise in neural network acceleration and deployment, Kirill and his team will show you how to run AI anywhere and everywhere!

OUTLINE:
00:00 - Jingle!
01:24 - Structure of the podcast
01:46 - When and how to use quantization?
03:11 - Speed up or reduce memory? Or both?
04:18 - Hardware with quantization support
05:28 - DNN compilers to run quantized networks
06:01 - What is quantization mathematically?
07:22 - Fake quantized tensors
08:43 - Symmetric, asymmetric, per-tensor, per-channel, per-group
09:43 - Quantized matrix multiplication
11:31 - Quantization algorithms
13:23 - Examples of PTQ and QAT
16:11 - Quantized parameters do not exist in a discrete space! Is it a manifold?
18:08 - Details of the next episode!
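For readers who want a concrete picture of the outline items on the mathematical definition, fake quantized tensors, and the symmetric per-tensor scheme, here is a minimal sketch in plain Python. It is not taken from the episode: the function names (`symmetric_scale`, `quantize`, `dequantize`, `fake_quantize`), the 8-bit default, and the choice of a max-abs scale are all illustrative assumptions, and real frameworks differ in rounding and range details.

```python
def symmetric_scale(values, num_bits=8):
    """Per-tensor scale mapping the largest magnitude to the top integer level.

    Assumption: max-abs calibration; other calibration rules exist.
    """
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Map floats to signed integers: q = clamp(round(x / scale))."""
    qmax = 2 ** (num_bits - 1) - 1
    qmin = -qmax  # symmetric integer range, e.g. [-127, 127]
    # Python's round() rounds halves to even; frameworks may round differently.
    return [max(qmin, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map integers back to floats: x_hat = q * scale."""
    return [q * scale for q in q_values]


def fake_quantize(values, num_bits=8):
    """Quantize then dequantize, keeping floats that sit on the integer grid."""
    scale = symmetric_scale(values, num_bits)
    return dequantize(quantize(values, scale, num_bits), scale)


weights = [-1.0, -0.5, 0.0, 0.25, 1.27]
scale = symmetric_scale(weights)
q = quantize(weights, scale)
approx = fake_quantize(weights)
```

In this sketch the "fake quantized" values stay in floating point but can only take values of the form integer times scale, so each one differs from the original by at most half a scale step.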