Home
Categories
EXPLORE
True Crime
Comedy
Society & Culture
Sports
History
Education
Business
About Us
Contact Us
Copyright
© 2024 PodJoint
Loading...
0:00 / 0:00
Podjoint Logo
US
Sign in

or

Don't have an account?
Sign up
Forgot password
https://is1-ssl.mzstatic.com/image/thumb/Podcasts221/v4/41/5f/f4/415ff42b-f3e4-62d7-e017-6ccd1fe8b935/mza_9584984547342647834.jpg/600x600bb.jpg
Deep Papers
Arize AI
50 episodes
15 hours ago
Join us as we discuss Accurate KV Cache Quantization with Outlier Tokens Tracing, a deep dive into improving the efficiency of LLM inference. The authors enhance KV Cache quantization, a technique for reducing memory and compute costs during inference, by introducing a method to identify and exclude outlier tokens that hurt quantization accuracy, striking a better balance between efficiency and performance. Paper: https://arxiv.org/abs/2505.10938 Slides: https://bit.ly/45wolpr Join us for Ar...
Show more...
Mathematics
Technology,
Business,
Science
RSS
All content for Deep Papers is the property of Arize AI and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Join us as we discuss Accurate KV Cache Quantization with Outlier Tokens Tracing, a deep dive into improving the efficiency of LLM inference. The authors enhance KV Cache quantization, a technique for reducing memory and compute costs during inference, by introducing a method to identify and exclude outlier tokens that hurt quantization accuracy, striking a better balance between efficiency and performance. Paper: https://arxiv.org/abs/2505.10938 Slides: https://bit.ly/45wolpr Join us for Ar...
Show more...
Mathematics
Technology,
Business,
Science
https://is1-ssl.mzstatic.com/image/thumb/Podcasts221/v4/41/5f/f4/415ff42b-f3e4-62d7-e017-6ccd1fe8b935/mza_9584984547342647834.jpg/600x600bb.jpg
How DeepSeek is Pushing the Boundaries of AI Development
Deep Papers
29 minutes
3 months ago
How DeepSeek is Pushing the Boundaries of AI Development
This week, we dive into DeepSeek. SallyAnn DeLucia, Product Manager at Arize, and Nick Luzio, a Solutions Engineer, break down key insights on a model that have dominating headlines for its significant breakthrough in inference speed over other models. What’s next for AI (and open source)? From training strategies to real-world performance, here’s what you need to know. Read a summary: https://arize.com/blog/how-deepseek-is-pushing-the-boundaries-of-ai-development/ Learn more about AI observa...
Deep Papers
Join us as we discuss Accurate KV Cache Quantization with Outlier Tokens Tracing, a deep dive into improving the efficiency of LLM inference. The authors enhance KV Cache quantization, a technique for reducing memory and compute costs during inference, by introducing a method to identify and exclude outlier tokens that hurt quantization accuracy, striking a better balance between efficiency and performance. Paper: https://arxiv.org/abs/2505.10938 Slides: https://bit.ly/45wolpr Join us for Ar...