Data Science #28 - The Bloom filter algorithm

Data Science Decoded

Mike E

32 episodes

1 month ago

We discuss seminal mathematical papers (sometimes really old 😎) that have shaped and established the fields of machine learning and data science as we know them today. The goal of the podcast is to introduce you to the evolution of these fields from a mathematical and slightly philosophical perspective. We will discuss the contribution of these papers, not just from a pure math aspect but also how they influenced the discourse in the field, which areas were opened up as a result, and so on. Our podcast episodes are also available on our YouTube: https://youtu.be/wThcXx_vXjQ?si=vnMfs

Mathematics

Science

39 minutes 15 seconds

6 months ago

In the 28th episode, we go over Burton Bloom's Bloom filter from 1970, a groundbreaking data structure that enables fast, space-efficient set membership checks by allowing a small, controllable rate of false positives. Unlike traditional methods that store the full data, Bloom filters use a compact bit array and multiple hash functions, trading exactness for speed and memory savings.

This idea transformed modern data science and big data systems, powering tools like Apache Spark, Cassandra, and Kafka, where fast filtering and memory efficiency are critical for performance at scale.
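
To make the bit array and multiple hash functions mentioned above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the construction from Bloom's paper or from any of the systems named: the class name `BloomFilter`, the method `might_contain`, and the use of SHA-256 with double hashing to derive the bit positions are all choices made for this example. The sizing uses the standard textbook formulas m = -n ln p / (ln 2)^2 for the number of bits and k = (m / n) ln 2 for the number of hash positions.

```python
import hashlib
import math


class BloomFilter:
    """Illustrative Bloom filter: a bit array plus k hash positions per item."""

    def __init__(self, expected_items: int, false_positive_rate: float):
        # Standard sizing formulas: m = -n ln p / (ln 2)^2, k = (m / n) ln 2.
        n, p = expected_items, false_positive_rate
        self.num_bits = max(1, math.ceil(-n * math.log(p) / (math.log(2) ** 2)))
        self.num_hashes = max(1, round(self.num_bits / n * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item: str):
        # Assumption for this sketch: derive the k positions from two halves
        # of one SHA-256 digest (double hashing) rather than k separate hashes.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # forced odd, so never zero
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        # False means definitely absent; True means "probably present".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    bf = BloomFilter(expected_items=1000, false_positive_rate=0.01)
    for i in range(1000):
        bf.add(f"user-{i}")
    print(bf.might_contain("user-42"))  # an added item is always reported present
    print(bf.num_bits, bf.num_hashes)   # size of the bit array and hash count
```

The filter never stores the items themselves, only bits, which is where the memory savings come from. The price is that `might_contain` can return `True` for an item that was never added, while an item that was added is never reported absent.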