The Gist Talk
kw
237 episodes
3 days ago
Welcome to The Gist Talk, the podcast where we break down the big ideas from the world’s most fascinating business and non-fiction books. Whether you’re a busy professional, a lifelong learner, or just someone curious about the latest insights shaping the world, this show is for you. Each episode, we’ll explore the key takeaways, actionable lessons, and inspiring stories—giving you the ‘gist’ of every book, one conversation at a time. Join us for engaging discussions that make learning effortless and fun.
Business
All content for The Gist Talk is the property of kw and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
DeepSeek Deployment with SGLang: Disaggregation and Expert Parallelism
The Gist Talk
51 minutes 58 seconds
3 weeks ago

This episode is based on a technical blog post from LMSYS Org detailing the deployment of the DeepSeek large language model (LLM) using the SGLang inference system on 96 H100 GPUs. The central focus is on two advanced optimization techniques, Prefill-Decode (PD) Disaggregation and Large-Scale Expert Parallelism (EP), which are needed to serve DeepSeek's Mixture of Experts (MoE) architecture efficiently. The authors explain how their implementation, which includes toolkits such as Disposable Tensor and the Expert Parallelism Load Balancer (EPLB), achieves throughput nearly matching the official DeepSeek profile while significantly reducing costs. Through extensive evaluation, they demonstrate substantial speedups over vanilla tensor parallelism, discuss detailed kernel breakdowns, and outline future work to address latency and scalability limitations.
