This paper details the evolution of Google's Spanner, a globally distributed database system, from a key-value store to a fully fledged SQL system. Key improvements discussed include distributed query execution, handling of transient failures via query restarts, efficient range extraction for data retrieval, and the adoption of a common SQL dialect. The authors also explain the transition from a Bigtable-like storage format to a more efficient blockwise-columnar store (Ressi). Finally, the paper highlights lessons learned during Spanner's large-scale deployment and outlines remaining challenges.
The article explores Change Data Capture (CDC), a method for tracking database changes, highlighting its advantages over traditional daily snapshots. It details three CDC implementation approaches: using database triggers (e.g., in PostgreSQL), capturing API requests and using a message broker (e.g., Kafka), and leveraging change streams within a data warehouse (e.g., Snowflake). The article compares these methods, weighing their pros and cons in terms of performance, scalability, and ease of implementation. A subsequent discussion critiques the presented methods, suggesting alternative, more robust solutions based on logical replication tools like Debezium.
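To make the trigger-based approach concrete, here is a minimal sketch (not taken from the article) of capturing row changes on a hypothetical PostgreSQL `orders` table into a changelog table; the table, trigger, and function names are illustrative.

```python
# Trigger-based CDC sketch: every INSERT/UPDATE/DELETE on "orders" is copied,
# with its operation type and a JSON row image, into "orders_changelog".
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS orders_changelog (
    id         BIGSERIAL PRIMARY KEY,
    operation  TEXT        NOT NULL,              -- INSERT / UPDATE / DELETE
    changed_at TIMESTAMPTZ NOT NULL DEFAULT now(),
    row_data   JSONB       NOT NULL               -- full row image as JSON
);

CREATE OR REPLACE FUNCTION capture_order_change() RETURNS trigger AS $$
BEGIN
    INSERT INTO orders_changelog (operation, row_data)
    VALUES (TG_OP, to_jsonb(COALESCE(NEW, OLD)));
    RETURN COALESCE(NEW, OLD);
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS orders_cdc ON orders;
CREATE TRIGGER orders_cdc
AFTER INSERT OR UPDATE OR DELETE ON orders
FOR EACH ROW EXECUTE FUNCTION capture_order_change();
"""

def install_cdc_trigger(dsn: str) -> None:
    """Install the changelog table and trigger on the source database."""
    with psycopg2.connect(dsn) as conn, conn.cursor() as cur:
        cur.execute(DDL)
```

The trade-off the article weighs is visible here: the capture logic runs inside every write transaction, which is simple to reason about but adds latency to the source database — one reason the follow-up discussion favours log-based tools such as Debezium.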
This research paper introduces DeepSeek-R1, a large language model enhanced for reasoning capabilities using reinforcement learning (RL). Two versions are presented: DeepSeek-R1-Zero, trained purely via RL without supervised fine-tuning, and DeepSeek-R1, which incorporates additional multi-stage training and cold-start data for improved readability and performance. DeepSeek-R1 achieves results comparable to OpenAI's o1-1217 on various reasoning benchmarks. The study also explores distilling DeepSeek-R1's reasoning capabilities into smaller, more efficient models, achieving state-of-the-art results. Finally, the paper discusses unsuccessful attempts using process reward models and Monte Carlo Tree Search, providing valuable insights for future research.
https://github.com/deepseek-ai/DeepSeek-R1/blob/main/DeepSeek_R1.pdf
This Atlassian blog post details the migration of Jira Cloud's Issue Service from JSON to Protocol Buffers (Protobuf) to enhance performance. The switch involved a phased approach to minimise downtime, creating new endpoints and logic to handle both formats concurrently before a complete transition. The results showcased significant improvements: 75% less Memcached CPU usage, 80% smaller data size, and a substantially faster response time. Challenges encountered included Protobuf's handling of null values and incompatibility with Spring's default error controller, which required workarounds. Ultimately, the migration yielded substantial performance gains and reduced infrastructure needs.
https://www.atlassian.com/blog/atlassian-engineering/using-protobuf-to-make-jira-cloud-faster
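On the null-handling point: proto3 scalars have no native null, which is why the migration needed workarounds. The snippet below is one generic way to carry an explicit null in a Protobuf payload using the well-known `Struct`/`NullValue` types — shown only to illustrate the problem, not necessarily the workaround Atlassian used; the field names are invented.

```python
# proto3 has no null for scalar fields, so "missing" and "null" look the same.
# The Struct well-known type can carry an explicit JSON-style null.
from google.protobuf import struct_pb2, json_format

payload = struct_pb2.Struct()
payload.update({"summary": "Fix login bug", "assignee": None})  # None -> NullValue

print(json_format.MessageToJson(payload))
# {"summary": "Fix login bug", "assignee": null}
```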
This research paper introduces Hyaline, a novel family of memory reclamation schemes for lock-free data structures in unmanaged C/C++ code. Hyaline leverages reference counting, but only during reclamation, minimising overhead during object access and balancing workload across threads. The paper details Hyaline's design, including a scalable multi-list version and robust extensions to handle stalled threads. Extensive testing across multiple architectures demonstrates Hyaline's superior performance and memory efficiency compared to existing schemes like epoch-based reclamation and hazard pointers, particularly in read-dominated and oversubscribed scenarios. The paper concludes by proving Hyaline's correctness and lock-freedom properties.
This Atlassian blog post details Trello's migration from RabbitMQ to Kafka for its websocket architecture. RabbitMQ's unreliability during network partitions and high costs associated with queue creation and deletion prompted the switch. The article compares various queuing systems, highlighting Kafka's superior failover capabilities and in-order message delivery. Trello implemented a master-client architecture with Kafka, resulting in improved performance, reduced costs, and fewer outages. Key performance improvements included a 33% decrease in memory usage and a substantial cost reduction.
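A small sketch (not Trello's actual code) of the in-order delivery property mentioned above: keying Kafka messages by a board identifier sends all updates for that board to the same partition, so consumers replay them in order. Topic, key, and field names are illustrative, using the `confluent_kafka` client.

```python
import json
from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def publish_board_update(board_id: str, delta: dict) -> None:
    # Same key -> same partition -> strict ordering for that board's updates.
    producer.produce(
        "board-updates",
        key=board_id,
        value=json.dumps(delta).encode(),
    )

publish_board_update("board-123", {"card": "c1", "action": "moved"})
producer.flush()  # block until outstanding messages are delivered
```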
This podcast explores the field of reliability engineering, tracing its origins at Google with the development of Site Reliability Engineering (SRE). It differentiates reliability engineering from SRE, highlighting its broader applicability across various organisational structures. The podcast outlines four key promises of a successful reliability team: defining service levels (SLA/SLO/SLI), managing the service infrastructure, participating in technical design, and providing tactical support during incidents. Finally, it discusses the evolving landscape of reliability engineering, emphasising pragmatic approaches to balancing cost and reliability needs, and advocating for a more nuanced understanding of when to build versus buy solutions.
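The SLO discussion boils down to simple error-budget arithmetic; here is a back-of-the-envelope sketch, where the 99.9% target and 30-day window are illustrative rather than taken from the episode.

```python
# How much downtime does a given availability SLO permit per window?
SLO = 0.999          # 99.9% availability target
WINDOW_DAYS = 30

window_minutes = WINDOW_DAYS * 24 * 60
error_budget_minutes = (1 - SLO) * window_minutes

print(f"{SLO:.1%} over {WINDOW_DAYS} days allows ~{error_budget_minutes:.0f} minutes of downtime")
# -> 99.9% over 30 days allows ~43 minutes of downtime
```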
This podcast profiles Antithesis, a company developing a "multiverse debugger" for large, distributed systems. It traces the history of debugging tools, highlighting Antithesis's innovative approach using deterministic simulation testing (DST) to allow time travel debugging. The podcast includes a Q&A with Antithesis's co-founder, detailing the challenges of debugging large systems and how Antithesis addresses them. Furthermore, it discusses Antithesis's tech stack, engineering culture, and the trade-offs of using their complex, but potentially game-changing, technology. Finally, it considers the implications of widespread adoption of Antithesis's technology for the future of software development.
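To give a flavour of deterministic simulation testing, here is a toy illustration (in no way Antithesis's implementation): the whole "system" runs single-threaded, and all nondeterminism — message ordering and injected drops — flows from one seeded RNG, so any failing run can be replayed exactly from its seed.

```python
import random

def run_simulation(seed: int) -> bool:
    rng = random.Random(seed)
    inbox, store = ["write:a", "write:b", "ack"], []
    while inbox:
        msg = inbox.pop(rng.randrange(len(inbox)))   # RNG decides delivery order
        if rng.random() < 0.1:                       # RNG injects message drops
            continue
        store.append(msg)
    return "ack" in store                            # invariant under test

# Every failure is tied to a seed, so it can be replayed and debugged step by step.
failing = [s for s in range(1000) if not run_simulation(s)]
print("replayable failing seeds:", failing[:5])
```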
This podcast details the creation of Shopify's interactive Black Friday/Cyber Monday live dashboard, nicknamed "Live Globe". The 2024 version, built by a six-person team in two months, features a spaceship-themed interface showcasing real-time sales data and boasts impressive technical specifications, including peak loads of nearly 30 million database reads per second. The design process involved extensive prototyping and the use of AI-generated imagery for inspiration. The podcast also highlights the technology stack (React Three Fiber, Go, Rails, Kafka, and Flink), the inclusion of numerous Easter eggs, and the challenges of performance optimisation and real-time data streaming. Finally, it explores the project's unique approach to ROI, prioritising fun and innovation.
This podcast examines the contrasting "wartime" and "peacetime" operating modes in tech companies, drawing on the author's experiences at Uber and observations across the industry. It defines these modes in terms of leadership styles, employee behaviours, and organisational priorities, highlighting the differences in approaches to project management, performance reviews, and tech debt. The text explores the transitions between these modes, identifying common triggers and observable signs, and offers advice for employees and managers on thriving in each environment. Finally, it discusses the counterintuitive relationship between extended "wartime" periods and tech debt accumulation.
Jim McCormick's "The First Time Manager" offers a practical guide for new managers, covering essential aspects like communication, delegation, and conflict resolution. The book employs a clear and relatable style, using real-world examples and actionable advice to help readers build foundational leadership skills. While some advice may be general, its comprehensive approach to fundamental management principles makes it a valuable resource for aspiring and new managers seeking a strong start in their careers. The book also touches on crucial aspects of personal development and emotional intelligence in leadership. Even experienced managers might find its refresher on core concepts beneficial.
This research paper details the development and implementation of efficient techniques for processing multiple, similar aggregate queries in data streaming systems. The authors address the challenges of scaling to handle hundreds of concurrent queries, each with potentially different time windows and selection predicates. Their proposed "on-the-fly" methods avoid computationally expensive static query analysis, offering significant performance improvements (up to an order of magnitude) over existing approaches. The techniques are validated through a performance study using real-world stock market data, demonstrating their practical effectiveness. The core contributions are novel algorithms for shared time slices, shared data fragments, and a combined approach called shared data shards.
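As a rough intuition for the shared-time-slices idea (a simplified sketch, not the paper's algorithm): the stream is bucketed into fixed-width slices, each slice keeps one partial aggregate, and every query, whatever its window length, is answered by combining the slices it covers. The slice width and queries below are illustrative.

```python
from collections import defaultdict

SLICE_SECONDS = 60
slice_sums = defaultdict(float)      # slice index -> partial SUM(price)

def ingest(timestamp: float, price: float) -> None:
    slice_sums[int(timestamp // SLICE_SECONDS)] += price

def window_sum(now: float, window_seconds: int) -> float:
    """Answer SUM(price) over the last window_seconds, reusing the shared slices."""
    end = int(now // SLICE_SECONDS)
    start = int((now - window_seconds) // SLICE_SECONDS) + 1
    return sum(slice_sums[i] for i in range(start, end + 1))

# Two queries with different windows share the same per-slice partials,
# so the stream is aggregated once rather than once per query.
ingest(100, 10.0); ingest(130, 5.0); ingest(200, 2.0)
print(window_sum(now=210, window_seconds=120))   # roughly the last 2 minutes
print(window_sum(now=210, window_seconds=600))   # roughly the last 10 minutes
```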
The article details how Vercel's platform handles web requests, from initial user input to final response. Vercel's Edge Network directs requests to optimal data centres, minimising latency. A multi-layered firewall system protects against threats. Advanced routing features, including middleware, manage request flow. Finally, Edge caching and Vercel Functions optimise speed and scalability for dynamic content.
This research paper introduces Monolith, a real-time recommendation system designed by ByteDance. Addressing limitations of existing deep learning frameworks, Monolith uses a novel collisionless embedding table to efficiently handle sparse, dynamic features, significantly improving model quality and memory usage. A key innovation is its online training architecture, enabling real-time model updates based on user feedback. The authors demonstrate Monolith's superior performance through experiments and A/B tests, highlighting the trade-offs between real-time learning and system reliability. Finally, the paper compares Monolith to existing solutions, showcasing its advantages in scalability and efficiency for large-scale recommendation tasks.
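The toy contrast below (not Monolith's code) shows why collisionless tables matter: a fixed-size table folds sparse feature IDs modulo its size, so unrelated features can share one vector, whereas a table keyed directly by the raw ID gives every feature its own embedding. Sizes and IDs are illustrative.

```python
import numpy as np

DIM, BUCKETS = 4, 8
rng = np.random.default_rng(0)

# Collision-prone: IDs are folded into a fixed number of buckets.
fixed_table = rng.normal(size=(BUCKETS, DIM))
def lookup_fixed(feature_id: int) -> np.ndarray:
    return fixed_table[feature_id % BUCKETS]

# "Collisionless": the table grows on demand, one row per distinct raw ID.
dynamic_table: dict[int, np.ndarray] = {}
def lookup_collisionless(feature_id: int) -> np.ndarray:
    return dynamic_table.setdefault(feature_id, rng.normal(size=DIM))

print(np.array_equal(lookup_fixed(3), lookup_fixed(11)))                  # True: 3 and 11 collide
print(np.array_equal(lookup_collisionless(3), lookup_collisionless(11)))  # False: separate rows
```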
This article looks back at the history of the Postgres project, spearheaded by Michael Stonebraker at UC Berkeley from the mid-1980s to the mid-1990s. It details Stonebraker's design philosophy and the project's technical innovations, including support for abstract data types, active databases, and novel storage and recovery mechanisms. The article highlights Postgres's evolution into the open-source PostgreSQL system, its significant commercial impact through various spin-off companies, and the lessons learned from its success. It also discusses the unexpected benefits of open-sourcing the research and the project's lasting influence on database technology. The author reflects on his own involvement and contributions to the project.
This blog post details Yelp's in-place migration of their Yelp Reservations service database from PostgreSQL to MySQL. The migration, necessitated by maintenance and expertise limitations with PostgreSQL, involved significant code refactoring to address unsupported features and ensure data consistency. A gradual rollout strategy, employing multi-DB support and careful synchronisation, was implemented to minimise disruption. The process revealed several unexpected challenges, including issues with auto-incrementing keys and ProxySQL memory usage, highlighting the complexities of such large-scale database migrations. Ultimately, the switch to the company standard MySQL improved performance and maintainability.
Meta's FBDetect system, detailed in this research paper, is a robust, in-production performance regression detection system. It identifies minuscule performance regressions (as small as 0.005%) across millions of servers and hundreds of services by monitoring hundreds of thousands of time series metrics. Key to FBDetect's success are advanced techniques for subroutine-level performance analysis, filtering false positives, deduplicating correlated regressions, and root cause analysis. The paper validates FBDetect's effectiveness through simulations and real-world production data, showcasing its superiority over existing methods and highlighting the significance of its seven years of successful operation.
This paper details the architecture and evolution of Amazon DynamoDB, a fully managed NoSQL database service. Key features highlighted include its scalability, predictable performance, high availability (achieved through multi-region replication and sophisticated failure handling), and strong durability (guaranteed by techniques like write-ahead logging and continuous data verification). The authors discuss challenges faced during DynamoDB's development, such as handling uneven traffic distribution and optimising resource allocation, and explain the solutions implemented, including the shift from provisioned to on-demand capacity. Performance benchmarks are provided to demonstrate the system's consistent low latency even under extreme load.
This blog post discusses the multifaceted definition of a senior software engineer. Technical expertise is crucial, encompassing a T-shaped skill profile and a deep understanding of software development principles. However, soft skills, such as communication, leadership, and a growth mindset, are equally vital for moving projects and teams forward. The author suggests several strategies for professional growth, including pair programming, content creation, and seeking challenging tasks. Ultimately, the article posits that becoming a senior engineer is an ongoing journey of learning and improvement, rather than a fixed destination.
Amazon Web Services (AWS) has launched Amazon S3 Tables, a new storage service optimised for analytical workloads. These tables, stored in a new type of S3 bucket, utilise the Apache Iceberg format for efficient querying with tools like Amazon Athena and Apache Spark. Offering significant performance improvements (up to 3x faster queries and 10x more transactions per second) over self-managed solutions, S3 Tables provide fully managed features including automatic compaction, snapshot management, and unreferenced file removal. The service integrates with other AWS analytics services and supports standard S3 APIs, offering enhanced security and scalability. Currently available in select US regions, S3 Tables are designed to streamline large-scale data analytics.
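Because S3 Tables expose standard Iceberg tables, they can be queried with ordinary SQL through Athena. The sketch below uses boto3's Athena client and assumes the table bucket is already registered with the analytics catalog; the database, table, and result-bucket names are placeholders.

```python
import boto3

athena = boto3.client("athena", region_name="us-east-1")

response = athena.start_query_execution(
    QueryString="SELECT order_date, SUM(amount) FROM daily_orders GROUP BY order_date",
    QueryExecutionContext={"Database": "sales_namespace"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-results/"},
)
print("query execution id:", response["QueryExecutionId"])
```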