Snacks Weekly on Data Science
Pan Wu
111 episodes
2 days ago
This podcast makes data science and machine learning knowledge accessible and less intimidating. Every week, I handpick one industry tech blog and break it down. We discuss key data science concepts and machine learning algorithms, and how they are applied in real-world applications. Subscribe to the channel and enjoy Snacks Weekly on Data Science!
Education
All content for Snacks Weekly on Data Science is the property of Pan Wu and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
Episodes (20/111)
Personalizing Marketing with Uplift Modeling [Klaviyo]

In this episode, we explore how Klaviyo used counterfactual learning and uplift modeling to move beyond the question of which treatment works to the deeper question of for whom it works. We’ll see how the team combined randomized experiments, causal inference techniques, and uplift modeling to power a product that helps marketers deliver smarter, more personalized messages.
For more details, you can refer to their published tech blog, linked here for your reference: https://klaviyo.tech/the-stats-that-tell-you-what-could-have-been-counterfactual-learning-and-uplift-modeling-e95d3b712d8a
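
To make the "for whom it works" idea concrete, here is a minimal sketch of uplift estimation from a randomized experiment. It is not taken from the episode or from Klaviyo's blog: the segments, conversion rates, and function names (`simulate_experiment`, `uplift_by_segment`) are all made up for illustration, and the "model" is simply a per-segment difference in conversion rates.

```python
import random
from collections import defaultdict


def simulate_experiment(n=40000, seed=0):
    """Hypothetical randomized experiment; all rates below are assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        segment = rng.choice(["new", "loyal"])
        treated = rng.random() < 0.5  # coin-flip assignment
        base = 0.10 if segment == "new" else 0.30
        # Assumed ground truth: only new customers respond to the message.
        lift = 0.08 if segment == "new" else 0.00
        converted = rng.random() < base + (lift if treated else 0.0)
        rows.append((segment, treated, converted))
    return rows


def uplift_by_segment(rows):
    """Uplift = treated conversion rate minus control conversion rate, per segment."""
    # Per segment: [treated count, treated conversions, control count, control conversions]
    counts = defaultdict(lambda: [0, 0, 0, 0])
    for segment, treated, converted in rows:
        c = counts[segment]
        if treated:
            c[0] += 1
            c[1] += converted
        else:
            c[2] += 1
            c[3] += converted
    return {s: c[1] / c[0] - c[3] / c[2] for s, c in counts.items()}


if __name__ == "__main__":
    for segment, uplift in uplift_by_segment(simulate_experiment()).items():
        print(f"{segment}: estimated uplift {uplift:+.3f}")
```

Because assignment is random, the per-segment difference is an unbiased estimate of the treatment effect in that segment. A production uplift model would replace the segment lookup with a learned model over many customer features.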

2 days ago
9 minutes 51 seconds

Quick History and Fun Facts About Halloween: Pumpkins, Candies, and Costumes

In this Halloween special episode, we explore some fun facts and surprising data behind these festive favorites: Did you know Illinois is the top pumpkin-producing state, harvesting nearly 40% of all pumpkins in the U.S.? Or that Reese’s Peanut Butter Cups consistently rank as America’s most popular Halloween candy? And that at least 20% of pet owners now dress up their pets for Halloween? Now, let’s dive into these facts and the history behind the holiday. Enjoy!

1 week ago
7 minutes 2 seconds

Feed Ranking: From Batch Inference to Online Inference [Whatnot]

In this episode, we explore how Whatnot improved its feed ranking system by moving from batch predictions to online inference—enabling the platform to scale effectively while capturing real-time marketplace dynamics. This evolution reflects a broader shift in recommendation systems toward more adaptive, real-time personalization.

For more details, check out the full tech blog from the Whatnot engineering team: https://medium.com/whatnot-engineering/evolving-feed-ranking-at-whatnot-25adb116aeb6

2 weeks ago
7 minutes 58 seconds

Self-serve Experimentation Tool for Marketing [Tripadvisor]

In this episode, we explore Tripadvisor’s self-serve experimentation platform for marketing. On the business side, the challenge was measuring campaign effectiveness in a messy, external environment where clean randomization isn’t always possible. On the technical side, the Tripadvisor team developed a system that applies causal inference techniques, particularly the difference-in-differences method, to deliver reliable estimates of campaign impact.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/tripadvisor/introducing-baldur-tripadvisors-self-serve-experimentation-tool-for-marketing-7fc9933b25cc
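
As a rough illustration of the difference-in-differences method mentioned above, the sketch below compares the before/after change in a treated market against the change in a comparison market. The weekly booking numbers and the function name `diff_in_diff` are invented for this example and do not come from Tripadvisor's blog.

```python
from statistics import mean


def diff_in_diff(treated_pre, treated_post, control_pre, control_post):
    """Change in the treated group minus change in the control group."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Made-up weekly bookings, four weeks before and four weeks after a campaign.
treated_pre = [100, 104, 98, 102]
treated_post = [120, 118, 124, 122]
control_pre = [80, 82, 78, 80]
control_post = [88, 90, 86, 88]

effect = diff_in_diff(treated_pre, treated_post, control_pre, control_post)
print(f"Estimated campaign effect: {effect} bookings per week")
```

The estimate is only credible under the parallel-trends assumption: without the campaign, the treated market would have moved the same way the control market did.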

3 weeks ago
10 minutes 42 seconds

Global Feature Importance with Collective Wisdom [Meta]

In this episode, we look at how Meta addressed the challenge of feature selection at scale through Global Feature Importance—a system that aggregates insights across models to surface the most valuable features. This approach not only streamlines model development but also enables machine learning engineers to iterate more effectively and build models that deliver stronger business impact.
For more details, check out Meta’s published tech blog here: https://medium.com/@AnalyticsAtMeta/collective-wisdom-of-models-advanced-feature-importance-techniques-at-meta-1a7a8d2f9e27

1 month ago
8 minutes 25 seconds

Evaluating Retrieval Capabilities of Language Models [Microsoft]

In this episode, we explore how to evaluate the retrieval-augmented generation (RAG) capabilities of small language models. On the business side, we discuss why RAG, long context windows, and small language models are critical for building scalable and reliable AI systems. On the technical side, we walk through the Needle-in-a-Haystack methodology and discuss key findings about retrieval performance across different models.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/data-science-at-microsoft/evaluating-rag-capabilities-of-small-language-models-e7531b3a5061

1 month ago
10 minutes 1 second

Personalized Recommendation with Foundation Models [Netflix]

In this episode, we explore how Netflix enhanced recommendation personalization using foundation models. These models can process massive user histories through tokenization and attention mechanisms, while also addressing the cold-start problem with hybrid embeddings. The work highlights how principles from large language models can be adapted to build more effective recommendation systems at scale.

For more details, you can refer to their published tech blog, linked here for your reference: https://netflixtechblog.com/foundation-model-for-personalized-recommendation-1a0bd8e02d39

1 month ago
11 minutes 37 seconds

A/B Testing vs. Multi-Armed Bandits: A Simulated Study [Vanguard]

In this episode, we explore how Vanguard evaluated standard A/B testing against multi-armed bandits for digital experimentation. Their simulated study showed that A/B testing is often the better choice when dealing with a small number of variations, while bandit strategies, such as Thompson Sampling, become more effective as the number of variations increases. The broader lesson is that experimentation design should always be context-aware—balancing simplicity, speed, and interpretability based on your business needs.


For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/vanguard-technology/smarter-web-wins-a-b-testing-vs-multi-armed-bandits-unpacked-7f5032358513
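
For readers who have not seen Thompson Sampling before, here is a minimal Bernoulli-bandit sketch. It is not Vanguard's simulation: the conversion rates, round count, and the function name `thompson_sampling` are assumptions chosen for illustration. Each variation keeps a Beta posterior over its conversion rate, and every visitor is sent to the variation whose sampled rate is highest.

```python
import random


def thompson_sampling(true_rates, n_rounds=5000, seed=0):
    """Simulate Thompson Sampling; returns how often each variation was shown."""
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(n_rounds):
        # Draw one plausible conversion rate per arm from its Beta(1 + s, 1 + f) posterior.
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)
        pulls[arm] += 1
        # Simulated visitor response; true_rates is hidden from the algorithm.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return pulls


if __name__ == "__main__":
    # Hypothetical conversion rates for three page variations.
    print(thompson_sampling([0.05, 0.10, 0.30]))
```

A fixed A/B/n test would instead show each variation to an equal share of visitors for the whole run, which is simpler to analyze but spends more traffic on weak variations.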

1 month ago
10 minutes 33 seconds

Catalog Attribute Extraction with Multi-Modal LLMs [Instacart]

In this episode, we explore how Instacart tackled the challenge of extracting accurate product attributes at scale. We discuss different solutions—starting with SQL rules, moving to text-based ML models, and finally, Instacart’s multi-modal LLM platform, PARSE. By blending text and image data and enabling rapid configuration, PARSE demonstrates how modern AI tools can streamline data pipelines, reduce engineering overhead, and deliver better user experiences.


For more details, you can refer to their published tech blog, linked here for your reference: https://tech.instacart.com/multi-modal-catalog-attribute-extraction-platform-at-instacart-b9228754a527

1 month ago
10 minutes 24 seconds

Segmenting Supply with a Data-Driven Methodology [Airbnb]

In this episode, we explore how Airbnb developed a structured framework that combines unsupervised clustering and supervised modeling to classify listings into meaningful supply personas based on availability patterns. This data-driven approach helps Airbnb enhance personalization, improve experimentation, and gain deeper insights into its global supply base.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/airbnb-engineering/from-data-to-insights-segmenting-airbnbs-supply-c88aa2bb9399

2 months ago
8 minutes 17 seconds

Causal Inference with Bayesian Structural Time Series Model [Walmart]

In this episode, we explore the Bayesian Structural Time Series model as a causal inference methodology and walk through a real-world example of how Walmart leveraged it to measure the impact of a simple yet meaningful product taxonomy change.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/walmartglobaltech/decoding-causal-incrementality-in-e-commerce-leveraging-bayesian-structural-time-series-model-with-f7eaf7267d69

2 months ago
8 minutes 38 seconds

Advancements in Embedding-Based Retrieval [Pinterest]

In this episode, we delve into how Pinterest has enhanced its embedding-based retrieval system to provide a more personalized, relevant, and dynamic Homefeed experience. By scaling their models with richer feature interactions, refreshing the content corpus with trending Pins, and leveraging cutting-edge machine learning techniques, Pinterest is able to serve better content—faster and more accurately—to hundreds of millions of users.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/pinterest-engineering/advancements-in-embedding-based-retrieval-at-pinterest-homefeed-d7d7971a409e

2 months ago
10 minutes 25 seconds

How Data Scientists Lead and Drive Impact [Meta]

In this episode, we dive into what it’s like to be a data scientist at Meta. Grounded in product leadership, data scientists at Meta apply deep analytical expertise to drive measurement, navigate complex product ecosystems, and shape key decisions—ultimately delivering meaningful impact on product outcomes.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/@AnalyticsAtMeta/how-data-scientists-lead-and-drive-impact-at-meta-6b5b896821b2

2 months ago
10 minutes 27 seconds

Building Scalable Risk Management Platform [Revolut]

In this episode, we explore how Revolut is reimagining risk management. By developing a modular, scientifically grounded, and explainable platform, the team has enabled faster, more accurate, and more transparent risk decisions—spanning diverse products and global markets.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/revolut/reinventing-risk-at-revolut-77e63c552503 

3 months ago
10 minutes 12 seconds

Tackling Interference Bias with Marketplace Marginal Values [Lyft]

In this episode, we explore how Lyft tackles interference bias in marketplace experiments using Marketplace Marginal Values (MMVs). We break down why interference is a natural challenge in two-sided platforms like Lyft, and how their team uses optimization, simulation, and advanced metrics to measure causal effects more reliably.

For more details, check out the original tech blog linked here: https://eng.lyft.com/using-marketplace-marginal-values-to-address-interference-bias-a11aff6e670f

3 months ago
9 minutes 51 seconds

Causal Inference with Double Machine Learning [Microsoft]

In this episode, we explore how causal inference helps companies like Microsoft answer high‑stakes product and business questions when A/B testing isn’t possible. We dive into Double Machine Learning—a technique that leverages ML models to control for confounding variables and isolate true causal effects. The result is a flexible, rigorous framework that every data scientist should have in their toolkit.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/data-science-at-microsoft/introduction-to-causal-inference-using-double-machine-learning-5daa642321f3
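
The core move in Double Machine Learning, predicting both the treatment and the outcome from the confounders and then regressing residual on residual, can be sketched in a few lines. This is a simplified illustration and not Microsoft's implementation: the data is simulated, there is a single confounder, and a plain least-squares line stands in for the flexible ML models a real analysis would use. The helper names (`fit_line`, `double_ml_effect`, `simulate`) are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def double_ml_effect(x, t, y, n_folds=2):
    """Partialling-out estimate of the effect of t on y, with cross-fitting."""
    n = len(x)
    t_res, y_res = [0.0] * n, [0.0] * n
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        held_out = [i for i in range(n) if i % n_folds == fold]
        x_train = [x[i] for i in train]
        # Nuisance models are fit on the training folds only.
        a_t, b_t = fit_line(x_train, [t[i] for i in train])
        a_y, b_y = fit_line(x_train, [y[i] for i in train])
        # Residuals are computed on the held-out fold.
        for i in held_out:
            t_res[i] = t[i] - (a_t + b_t * x[i])
            y_res[i] = y[i] - (a_y + b_y * x[i])
    # Regress outcome residuals on treatment residuals (no intercept).
    return sum(a * b for a, b in zip(t_res, y_res)) / sum(a * a for a in t_res)


def simulate(n=4000, seed=0, true_effect=2.0):
    """Simulated data in which the confounder x drives both t and y."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    t = [0.5 * xi + rng.gauss(0, 1) for xi in x]
    y = [true_effect * ti + 3.0 * xi + rng.gauss(0, 1) for ti, xi in zip(t, x)]
    return x, t, y


if __name__ == "__main__":
    x, t, y = simulate()
    print("naive slope of y on t:", round(fit_line(t, y)[1], 2))
    print("double ML estimate:", round(double_ml_effect(x, t, y), 2))
```

In this simulation the naive regression of the outcome on the treatment is biased upward because the confounder pushes both in the same direction, while the residual-on-residual estimate should land close to the assumed true effect of 2.0.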

3 months ago
8 minutes 13 seconds

Scalable and Blendable Feed Construction [Whatnot]

In this episode, we explore how Whatnot tackled the challenge of scaling feed recommendation systems across a rapidly growing platform. We dive into WhataMix—a DAG-based framework that enables teams to build, test, and deploy feed logic using reusable, modular components. It’s a great example of how thoughtful system design can accelerate development while maintaining high standards in machine learning infrastructure.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/whatnot-engineering/whatamix-blendable-feed-construction-2c94c21f6635 

3 months ago
8 minutes 49 seconds

Using Generative and Traditional AI to Enhance Travel Experience [Expedia]

In this episode, we explore how Expedia is integrating both generative and traditional AI to enhance the travel experience. The company’s approach leverages generative models for open-ended, natural language tasks, and relies on traditional models for structured, mission-critical problems. By playing to the strengths of each, Expedia is able to build smarter, more adaptable AI systems without overcomplicating things or compromising on performance.


For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/expedia-group-tech/elevating-travel-experiences-with-ai-acdb2cf2ec13

4 months ago
9 minutes 41 seconds

Ensuring Data Quality at Petabyte Scale [Glassdoor]

In this episode, we dive into how Glassdoor addresses the challenge of maintaining data quality at petabyte scale. By treating data as a product, the engineering team built a centralized, scalable platform that enables proactive validation, continuous monitoring, and cross-team collaboration. From data contracts and static code analysis to LLM-based logic checks and anomaly detection, we unpack the key practices behind their approach.

For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/glassdoor-engineering/data-quality-at-petabyte-scale-building-trust-in-the-data-lifecycle-7052361307a4

4 months ago
11 minutes 50 seconds

Building a Travel Assistant with LLMs [Agoda]

In this episode, we explore how Agoda used large language models (LLMs) to improve user experience by building a conversational AI product. By focusing on prompt engineering, grounding data, and smart evaluation, the team built a scalable assistant that adds real value to the user journey.
For more details, you can refer to their published tech blog, linked here for your reference: https://medium.com/agoda-engineering/how-we-built-agodas-property-ama-bot-to-simplify-travel-decisions-b861c7ec7ff1

4 months ago
8 minutes 10 seconds
