The MAD Podcast with Matt Turck
Matt Turck
99 episodes
1 hour ago
The MAD Podcast with Matt Turck is a series of conversations with leaders from across the Machine Learning, AI & Data landscape, hosted by Matt Turck, a leading AI & data investor and Partner at FirstMark Capital.
Technology
Episodes (20/99)
The MAD Podcast with Matt Turck
State of AI 2025 with Nathan Benaich: Power Deals, Reasoning Breakthroughs, Real Revenue

Power is the new bottleneck, reasoning got real, and the business finally caught up. In this wide-ranging conversation, I sit down with Nathan Benaich, Founder and General Partner at Air Street Capital, to discuss the newly published 2025 State of AI report—what’s actually working, what’s hype, and where the next edge will come from. We start at the physical layer: energy procurement, PPAs, off-grid builds, and why water and grid constraints are turning power—not GPUs—into the decisive moat.


From there, we move into capability: reasoning models acting as AI co-scientists in verifiable domains, and the “chain-of-action” shift in robotics that’s taking us from polished demos to dependable deployments. Along the way, we examine the market reality—who’s making real revenue, how margins actually behave once tokens and inference meet pricing, and what all of this means for builders and investors.


We also zoom out to the ecosystem: NVIDIA’s position vs. custom silicon, China’s split stack, and the rise of sovereign AI (and the “sovereignty washing” that comes with it). The policy and security picture gets a hard look too—regulation’s vibe shift, data-rights realpolitik, and what agents and MCP mean for cyber risk and adoption.


Nathan closes with where he’s placing bets (bio, defense, robotics, voice) and three predictions for the next 12 months.


Nathan Benaich

Blog - https://www.nathanbenaich.com

X/Twitter - https://x.com/nathanbenaich

Source: State of AI Report 2025 (9/10/2025)


Air Street Capital

Website - https://www.airstreet.com

X/Twitter - https://x.com/airstreet


Matt Turck (Managing Director)

Blog - https://www.mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


(00:00) – Cold Open: “Gargantuan money, real reasoning”

(00:40) – Intro: State of AI 2025 with Nathan Benaich

(02:06) – Reasoning got real: from chain-of-thought to verified math wins

(04:11) – AI co-scientist: hypotheses, wet-lab validation, fewer “dumb stochastic parrots”

(04:44) – Chain-of-action robotics: plan → act you can audit

(05:13) – Humanoids vs. warehouse reality: where robots actually stick first

(06:32) – The business caught up: who’s making real revenue now

(08:26) – Adoption & spend: Ramp stats, retention, and the shadow-AI gap

(11:00) – Margins debate: tokens, pricing, and the thin-wrapper trap

(14:02) – Bubble or boom? Wall Street vs. SF vibes (and circular deals)

(19:54) – Power is the bottleneck: $50B/GW capex and the new moat

(21:02) – PPAs, gas turbines, and off-grid builds: the procurement game

(23:54) – Water, grids, and NIMBY: sustainability gets political

(25:08) – NVIDIA’s moat: 90% of papers, Broadcom/AMD, and custom silicon

(28:47) – China split-stack: Huawei, Cambricon, and export zigzags

(30:30) – Sovereign AI or “sovereignty washing”? Open source as leverage

(40:40) – Regulation & safety: from Bletchley to “AI Action”—the vibe shift

(44:06) – Safety budgets vs. lab spend; models that game evals

(44:46) – Data rights realpolitik: $1.5B signals the new training cost

(47:04) – Cyber risk in the agent era: MCP, malware LMs, state actors

(50:19) – Agents that convert: search → commerce and the demo flywheel

(54:18) – VC lens: where Nathan is investing (bio, defense, robotics, voice)

(68:29) – Predictions: power politics, AI neutrality, end-to-end discoveries

(1:02:13) – Wrap: what to watch next & where to find the report (stateof.ai)

3 days ago
1 hour 3 minutes 15 seconds

Are We Misreading the AI Exponential? Julian Schrittwieser on Move 37 & Scaling RL (Anthropic)

Are we failing to understand the exponential, again?

My guest is Julian Schrittwieser (top AI researcher at Anthropic; previously Google DeepMind on AlphaGo Zero & MuZero). We unpack his viral post (“Failing to Understand the Exponential, again”) and what it looks like when task length doubles every 3–4 months—pointing to AI agents that can work a full day autonomously by 2026 and expert-level breadth by 2027. We talk about the original Move 37 moment and whether today’s AI models can spark alien insights in code, math, and science—including Julian’s timeline for when AI could produce Nobel-level breakthroughs.


We go deep on the recipe of the moment—pre-training + RL—why it took time to combine them, what “RL from scratch” gets right and wrong, and how implicit world models show up in LLM agents. Julian explains the current rewards frontier (human prefs, rubrics, RLVR, process rewards), what we know about compute & scaling for RL, and why most builders should start with tools + prompts before considering RL-as-a-service. We also cover evals & Goodhart’s law (e.g., GDP-Val vs real usage), the latest in mechanistic interpretability (think “Golden Gate Claude”), and how safety & alignment actually surface in Anthropic’s launch process.


Finally, we zoom out: what 10× knowledge-work productivity could unlock across medicine, energy, and materials, how jobs adapt (complementarity over 1-for-1 replacement), and why the near term is likely a smooth ramp—fast, but not a discontinuity.


Julian Schrittwieser

Blog - https://www.julian.ac

X/Twitter - https://x.com/mononofu

Viral post: Failing to understand the exponential, again (9/27/2025)


Anthropic

Website - https://www.anthropic.com

X/Twitter - https://x.com/anthropicai


Matt Turck (Managing Director)

Blog - https://www.mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


(00:00) Cold open — “We’re not seeing any slowdown.”

(00:32) Intro — who Julian is & what we cover

(01:09) The “exponential” from inside frontier labs

(04:46) 2026–2027: agents that work a full day; expert-level breadth

(08:58) Benchmarks vs reality: long-horizon work, GDP-Val, user value

(10:26) Move 37 — what actually happened and why it mattered

(13:55) Novel science: AlphaCode/AlphaTensor → when does AI earn a Nobel?

(16:25) Discontinuity vs smooth progress (and warning signs)

(19:08) Does pre-training + RL get us there? (AGI debates aside)

(20:55) Sutton’s “RL from scratch”? Julian’s take

(23:03) Julian’s path: Google → DeepMind → Anthropic

(26:45) AlphaGo (learn + search) in plain English

(30:16) AlphaGo Zero (no human data)

(31:00) AlphaZero (one algorithm: Go, chess, shogi)

(31:46) MuZero (planning with a learned world model)

(33:23) Lessons for today’s agents: search + learning at scale

(34:57) Do LLMs already have implicit world models?

(39:02) Why RL on LLMs took time (stability, feedback loops)

(41:43) Compute & scaling for RL — what we see so far

(42:35) Rewards frontier: human prefs, rubrics, RLVR, process rewards

(44:36) RL training data & the “flywheel” (and why quality matters)

(48:02) RL & Agents 101 — why RL unlocks robustness

(50:51) Should builders use RL-as-a-service? Or just tools + prompts?

(52:18) What’s missing for dependable agents (capability vs engineering)

(53:51) Evals & Goodhart — internal vs external benchmarks

(57:35) Mechanistic interpretability & “Golden Gate Claude”

(1:00:03) Safety & alignment at Anthropic — how it shows up in practice

(1:03:48) Jobs: human–AI complementarity (comparative advantage)

(1:06:33) Inequality, policy, and the case for 10× productivity → abundance

(1:09:24) Closing thoughts

1 week ago
1 hour 9 minutes 56 seconds

How GPT-5 Thinks — OpenAI VP of Research Jerry Tworek

What does it really mean when GPT-5 “thinks”? In this conversation, OpenAI’s VP of Research Jerry Tworek explains how modern reasoning models work in practice—why pretraining and reinforcement learning (RL/RLHF) are both essential, what that on-screen “thinking” actually does, and when extra test-time compute helps (or doesn’t). We trace the evolution from O1 (a tech demo good at puzzles) to O3 (the tool-use shift) to GPT-5 (Jerry calls it “O3.1-ish”), and talk through verifiers, reward design, and the real trade-offs behind “auto” reasoning modes.


We also go inside OpenAI: how research is organized, why collaboration is unusually transparent, and how the company ships fast without losing rigor. Jerry shares the backstory on competitive-programming results like ICPC, what they signal (and what they don’t), and where agents and tool use are genuinely useful today. Finally, we zoom out: could pretraining + RL be the path to AGI?


This is the MAD Podcast — AI for the 99%. If you’re curious about how these systems actually work (without needing a PhD), this episode is your map to the current AI frontier.



OpenAI

Website - https://openai.com

X/Twitter - https://x.com/OpenAI


Jerry Tworek

LinkedIn - https://www.linkedin.com/in/jerry-tworek-b5b9aa56

X/Twitter - https://x.com/millionint


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:01) What Reasoning Actually Means in AI

(02:32) Chain of Thought: Models Thinking in Words

(05:25) How Models Decide Thinking Time

(07:24) Evolution from O1 to O3 to GPT-5

(11:00) Before OpenAI: Growing up in Poland, Dropping out of School, Trading

(20:32) Working on Robotics and Rubik's Cube Solving

(23:02) A Day in the Life: Talking to Researchers

(24:06) How Research Priorities Are Determined

(26:53) Collaboration vs IP Protection at OpenAI

(29:32) Shipping Fast While Doing Deep Research

(31:52) Using OpenAI's Own Tools Daily

(32:43) Pre-Training Plus RL: The Modern AI Stack

(35:10) Reinforcement Learning 101: Training Dogs

(40:17) The Evolution of Deep Reinforcement Learning

(42:09) When GPT-4 Seemed Underwhelming at First

(45:39) How RLHF Made GPT-4 Actually Useful

(48:02) Unsupervised vs Supervised Learning

(49:59) GRPO and How DeepSeek Accelerated US Research

(53:05) What It Takes to Scale Reinforcement Learning

(55:36) Agentic AI and Long-Horizon Thinking

(59:19) Alignment as an RL Problem

(1:01:11) Winning ICPC World Finals Without Specific Training

(1:05:53) Applying RL Beyond Math and Coding

(1:09:15) The Path from Here to AGI

(1:12:23) Pure RL vs Language Models

2 weeks ago
1 hour 16 minutes 4 seconds

Sonnet 4.5 & the AI Plateau Myth — Sholto Douglas (Anthropic)

Sholto Douglas, a top AI researcher at Anthropic, discusses the breakthroughs behind Claude Sonnet 4.5—the world's leading coding model—and why we might be just 2-3 years from AI matching human-level performance on most computer-facing tasks.


You'll discover why RL on language models suddenly started working in 2024, how agents maintain coherency across 30-hour coding sessions through self-correction and memory systems, and why the "bitter lesson" of scale keeps proving clever priors wrong.


Sholto shares his path from top-50 world fencer to Google's Gemini team to Anthropic, explaining why great blog posts sometimes matter more than PhDs in AI research. He discusses the culture at big AI labs and why Anthropic is laser-focused on coding (it's the fastest path to both economic impact and AI-assisted AI research). Sholto also discusses how the training pipeline is still "held together by duct tape" with massive room to improve, and why every benchmark created shows continuous rapid progress with no plateau in sight.


Bold predictions: individuals will soon manage teams of AI agents working 24/7, robotics is about to experience coding-level breakthroughs, and policymakers should urgently track AI progress on real economic tasks. A clear-eyed look at where AI stands today and where it's headed in the next few years.



Anthropic

Website - https://www.anthropic.com

Twitter - https://x.com/AnthropicAI


Sholto Douglas

LinkedIn - https://www.linkedin.com/in/sholto

Twitter - https://x.com/_sholtodouglas


FIRSTMARK

Website - https://firstmark.com

Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:09) The Rapid Pace of AI Releases at Anthropic

(02:49) Understanding Opus, Sonnet, and Haiku Model Tiers

(04:14) Sholto's Journey: From Australian Fencer to AI Researcher

(12:01) The Growing Pool of AI Talent

(16:16) Breaking Into AI Research Without Traditional Credentials

(18:29) What "Taste" Means in AI Research

(23:05) Moving to Google and Building Gemini's Inference Stack

(25:08) How Anthropic Differs from Other AI Labs

(31:46) Why Anthropic Is Laser-Focused on Coding

(36:40) Inside a 30-Hour Autonomous Coding Session

(38:41) Examples of What AI Can Build in 30 Hours

(43:13) The Breakthroughs That Enabled 30-Hour Runs

(46:28) What's Actually Driving the Performance Gains

(47:42) Pre-Training vs. Reinforcement Learning Explained

(52:11) Test-Time Compute and the New Scaling Paradigm

(55:55) Why RL on LLMs Finally Started Working

(59:38) Are We on Track to AGI?

(01:02:05) Why the "Plateau" Narrative Is Wrong

(01:03:41) Sonnet's Performance Across Economic Sectors

(01:05:47) Preparing for a World of 10–100x Individual Leverage

1 month ago
1 hour 10 minutes 3 seconds

Goodbye Excel? AI Agents for Self-Driving Finance – Pigment CEO

The most successful enterprises are about to become autonomous — and Eléonore Crespo, Co-CEO of Pigment, is building the nervous system that makes it possible. In this conversation, Eléonore reveals how her $400 million AI platform is already running supply chains for Coca-Cola, powering finance for the hottest newly public companies like Figma and Klarna, and processing thousands of financial scenarios for Uber and Snowflake faster and more accurately than any human team ever could.


Eléonore predicts Excel will outlive most AI companies (but maybe only as a user interface, not a calculation engine), explains why she deliberately chose to build from Paris instead of Silicon Valley, and shares her contrarian take on why the AI revolution will create more CFOs, not fewer.


You'll discover why Pigment's three-agent system (Analyst, Modeler, Planner) avoids the hallucination problems plaguing other AI companies, how they achieved human-level accuracy in financial analysis, and the accelerating timeline for fully autonomous enterprise planning that will make your current workforce obsolete.



Pigment

Website - https://www.pigment.com

Twitter - https://x.com/gopigment


Eléonore Crespo

LinkedIn - linkedin.com/in/eleonorecrespo


FIRSTMARK

Website - https://firstmark.com

Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:22) Building Pigment: 500 Employees, $400M Raised, 60% US Revenue

(03:20) From Quantum Physics to Google to Index Ventures

(06:56) Why Being a VC Was the Perfect Founder Training Ground

(11:35) The Impatience Factor: What Makes Great Founders

(13:27) Hiring for AI Fluency in the Modern Enterprise

(14:54) Pigment's Internal AI Strategy: Committees and Guardrails

(17:30) The Three AI Agents: Analyst, Modeler, and Planner

(22:15) Why Three Agents Instead of One: Technical Architecture

(24:10) Agent Coordination: How the Supervisor Agent Works

(24:46) Real Example: Budget Variance Analysis Across 50 Products

(27:15) The Human-in-the-Loop Approach: Recommendations Not Actions

(27:36) Solving Hallucination: Why Structured Data Changes Everything

(30:08) Behind the Scenes: Verification Agents and Audit Trails

(31:57) Beyond Accuracy: Enabling the Impossible at Scale

(36:21) Will AI Finally Kill Excel? Eléonore's Contrarian Take

(38:23) The Vision: Fully Autonomous Enterprise Planning

(40:55) Real-Time Supply Chain Adaptation: The Ukraine Example

(42:20) Multi-LLM Strategy: OpenAI, Anthropic, and Partner Integration

(44:32) Token Economics: Why Pigment Isn't Token-Intensive

(48:30) Customer Adoption: Excitement vs. Change Management Challenges

(50:51) Top-Down AI Demand vs. Bottom-Up Implementation Reality

(53:08) The Reskilling Challenge: Everyone Becomes a Mini CFO

(57:38) Building a Global Company from Europe During COVID

(01:00:02) Managing a US Executive Team from Paris

(01:01:14) SI Partner Strategy: Why Boutique Firms Come Before Deloitte

(01:03:28) The $100 Billion Vision: Beyond Performance Management

(01:05:08) Success Metrics: Innovation Over Revenue

1 month ago
1 hour 5 minutes 46 seconds

AI Video’s Wild Year – Runway CEO on What’s Next


2025 has been a breakthrough year for AI video. In this episode of the MAD Podcast, Matt Turck sits down with Cristóbal Valenzuela, CEO & Co-Founder of Runway, to explore how AI is reshaping the future of filmmaking, advertising, and storytelling - faster, cheaper, and in ways that were unimaginable even a year ago.


Cris and Matt discuss:


* How AI went from memes and spaghetti clips to IMAX film festivals.


* Why Gen-4 and Aleph are game-changing models for professionals.


* How Hollywood, advertisers, and creators are adopting AI video at scale.


* The future of storytelling: what happens to human taste, craft, and creativity when anyone can conjure movies on demand?


* Runway’s journey from 2018 skeptics to today’s cutting-edge research lab.


If you want to understand the future of filmmaking, media, and creativity in the AI age, this is the episode.


Runway

Website - https://runwayml.com

X/Twitter - https://x.com/runwayml


Cristóbal Valenzuela

LinkedIn - https://www.linkedin.com/in/cvalenzuelab

X/Twitter - https://x.com/c_valenzuelab

FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro – AI Video's Wild Year

(01:48) Runway's AI Film Festival Goes from Chinatown to IMAX

(04:02) Hollywood's Shift: From Ignoring AI to Adopting It at Scale

(06:38) How Runway Saves VFX Artists' Weekends of Work

(07:31) Inside Gen-4 and Aleph: Why These Models Are Game-Changers

(08:21) From Editing Tools to a "New Kind of Camera"

(10:00) Beyond Film: Gaming, Architecture, E-Commerce & Robotics Use Cases

(10:55) Why Advertising Is Adopting AI Video Faster Than Anyone Else

(11:38) How Creatives Adapt When Iteration Becomes Real-Time

(14:12) What Makes Someone Great at AI Video (Hint: No Preconceptions)

(15:28) The Early Days: Building Runway Before Generative AI Was "Real"

(20:27) Finding Early Product-Market Fit

(21:51) Balancing Research and Product Inside Runway

(24:23) Comparing Aleph vs. Gen-4, and the Future of Generalist Models

(30:36) New Input Modalities: Editing with Video + Annotations, Not Just Text

(33:46) Managing Expectations: Twitter Demos vs. Real Creative Work

(47:09) The Future: Real-Time AI Video and Fully Explorable 3D Worlds

(52:02) Runway's Business Model: From Indie Creators to Disney & Lionsgate

(57:26) Competing with the Big Labs (Sora, Google, etc.)

(59:58) Hyper-Personalized Content? Why It May Not Replace Film

(01:01:13) Advice to Founders: Treat Your Company Like a Model — Always Learning

(01:03:06) The Next 5 Years of Runway: Changing Creativity Forever

1 month ago
1 hour 4 minutes 57 seconds

How to Build a Beloved AI Product - Granola CEO Chris Pedregal


Granola is the rare AI startup that slipped into one of tech’s most crowded niches — meeting notes — and still managed to become the product founders and VCs rave about. In this episode, MAD Podcast host Matt Turck sits down with Granola co-founder & CEO Chris Pedregal to unpack how a two-person team in London turned a simple “second brain” idea into Silicon Valley’s favorite AI tool. Chris recounts a year in stealth onboarding users one by one, the 50% feature cut that unlocked simplicity, and why they refused to deploy a meeting bot or store audio even when investors said they were crazy.


We go deep on the craft of building a beloved AI product: choosing meetings (not email) as the data wedge, designing calendar-triggered habit loops, and obsessing over privacy so users trust the tool enough to outsource memory. Chris opens the hood on Granola’s tech stack — real-time ASR from Deepgram & Assembly, echo cancellation on-device, and dynamic routing across OpenAI, Anthropic and Google models — and explains why transcription, not LLM tokens, is the biggest cost driver today. He also reveals how internal eval tooling lets the team swap models overnight without breaking the “Granola voice.”


Looking ahead, Chris shares a roadmap that moves beyond notes toward a true “tool for thought”: cross-meeting insights in seconds, dynamic documents that update themselves, and eventually an AI coach that flags blind spots in your work. Whether you’re an engineer, designer, or founder figuring out your own AI strategy, this conversation is a masterclass in nailing product-market fit, trimming complexity, and future-proofing for the rapid advances still to come. Hit play, like, and subscribe if you’re ready to learn how to build AI products people can’t live without.



Granola

Website - https://www.granola.ai

X/Twitter - https://x.com/meetgranola


Chris Pedregal

LinkedIn - https://www.linkedin.com/in/pedregal

X/Twitter - https://x.com/cjpedregal


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Introduction: The Granola Story

(01:41) Building a "Life-Changing" Product

(04:31) The "Second Brain" Vision

(06:28) Augmentation Philosophy (Engelbart), Tools That Shape Us

(09:02) Late to a Crowded Market: Why it Worked

(13:43) Two Product Founders, Zero ML PhDs

(16:01) London vs. SF: Building Outside the Valley

(19:51) One Year in Stealth: Learning Before Launch

(22:40) "Building For Us" & Finding First Users

(25:41) Key Design Choices: No Meeting Bot, No Stored Audio

(29:24) Simplicity is Hard: Cutting 50% of Features

(32:54) Intuition vs. Data in Making Product Decisions

(36:25) Continuous User Conversations: 4–6 Calls/Week

(38:06) Prioritizing the Future: Build for Tomorrow's Workflows

(40:17) Tech Stack Tour: Model Routing & Evals

(42:29) Context Windows, Costs & Inference Economics

(45:03) Audio Stack: Transcription, Noise Cancellation & Diarization Limits

(48:27) Guardrails & Citations: Building Trust in AI

(50:00) Growth Loops Without Virality Hacks

(54:54) Enterprise Compliance, Data Footprint & Liability Risk

(57:07) Retention & Habit Formation: The "500 Millisecond Window"

(58:43) Competing with OpenAI and Legacy Suites

(01:01:27) The Future: Deep Research Across Meetings & Roadmap

(01:04:41) Granola as Career Coach?

2 months ago
1 hour 8 minutes 28 seconds

Anthropic's Surprise Hit: How Claude Code Became an AI Coding Powerhouse

What happens when an internal hack turns into a $400 million AI rocket ship? In this episode, Matt Turck sits down with Boris Cherny, the creator of Claude Code at Anthropic, to unpack the wild story behind the fastest-growing AI coding tool on the planet.


Boris reveals how Claude Code started as a personal productivity tool, only to become Anthropic’s secret weapon — now used by nearly every engineer at the company and rapidly spreading across the industry. You’ll hear how Claude Code’s “agentic” approach lets AI not just suggest code, but actually plan, edit, debug, and even manage entire projects—sometimes with a whole fleet of subagents working in parallel.


We go deep on why Claude Code runs in the terminal (and why that’s a feature, not a bug), how its Claude.md memory files let teams build a living, shareable knowledge base, and why safety and human-in-the-loop controls are baked into every action. Boris shares real stories of onboarding times dropping from weeks to days, and how even non-coders are hacking Claude Code for everything from note-taking to business metrics.


Anthropic

Website - https://www.anthropic.com

X/Twitter - https://x.com/AnthropicAI


Boris Cherny

LinkedIn - https://www.linkedin.com/in/bcherny

X/Twitter - https://x.com/bcherny


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:15) Did You Expect Claude Code’s Success?

(04:22) How Claude Code Works and Origins

(08:05) Command Line vs IDE: Why Start Claude Code in the Terminal?

(11:31) The Evolution of Programming: From Punch Cards to Agents

(13:20) Product Follows Model: Simple Interfaces and Fast Evolution

(15:17) Who Is Claude Code For? (Engineers, Designers, PMs & More)

(17:46) What Can Claude Code Actually Do? (Actions & Capabilities)

(21:14) Agentic Actions, Subagents, and Workflows

(25:30) Claude Code’s Awareness, Memory, and Knowledge Sharing

(33:28) Model Context Protocol (MCP) and Customization

(35:30) Safety, Human Oversight, and Enterprise Considerations

(38:10) UX/UI: Making Claude Code Useful and Enjoyable

(40:44) Pricing for Power Users and Subscription Models

(43:36) Real-World Use Cases: Debugging, Testing, and More

(46:44) How Does Claude Code Transform Onboarding?

(49:36) The Future of Coding: Agents, Teams, and Collaboration

(54:11) The AI Coding Wars: Competition & Ecosystem

(57:27) The Future of Coding as a Profession

(58:41) What’s Next for Claude Code



2 months ago
1 hour 16 seconds

Ex‑DeepMind Researcher Misha Laskin on Enterprise Super‑Intelligence | Reflection AI

What if your company had a digital brain that never forgot, always knew the answer, and could instantly tap the knowledge of your best engineers, even after they left? Superintelligence can feel like a hand‑wavy pipe‑dream — yet, as Misha Laskin argues, it becomes a tractable engineering problem once you scope it to the enterprise level. Former DeepMind researcher Laskin is betting on an oracle‑like AI that grasps every repo, Jira ticket and hallway aside as deeply as your principal engineer—and he’s building it at Reflection AI.


In this wide‑ranging conversation, Misha explains why coding is the fastest on‑ramp to superintelligence, how “organizational” beats “general” when real work is on the line, and why today’s retrieval‑augmented generation (RAG) feels like “exploring a jungle with a flashlight.” He walks us through Asimov, Reflection’s newly unveiled code‑research agent that fuses long‑context search, team‑wide memory and multi‑agent planning so developers spend less time spelunking for context and more time shipping.


We also rewind his unlikely journey—from physics prodigy in a Manhattan‑Project desert town, to Berkeley’s AI crucible, to leading RLHF for Google Gemini—before he left big‑lab comfort to chase a sharper vision of enterprise super‑intelligence. Along the way: the four breakthroughs that unlocked modern AI, why capital efficiency still matters in the GPU arms‑race, and how small teams can lure top talent away from nine‑figure offers.


If you’re curious about the next phase of AI agents, the future of developer tooling, or the gritty realities of scaling a frontier‑level startup—this episode is your blueprint.


Reflection AI

Website - https://reflection.ai

LinkedIn - https://www.linkedin.com/company/reflectionai


Misha Laskin

LinkedIn - https://www.linkedin.com/in/mishalaskin

X/Twitter - https://x.com/mishalaskin


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:42) Reflection AI: Company Origins and Mission

(04:14) Making Superintelligence Concrete

(06:04) Superintelligence vs. AGI: Why the Goalposts Moved

(07:55) Organizational Superintelligence as an Oracle

(12:05) Coding as the Shortcut: Hands, Legs & Brain for AI

(16:00) Building the Context Engine

(20:55) Capturing Tribal Knowledge in Organizations

(26:31) Introducing Asimov: A Deep Code Research Agent

(28:44) Team-Wide Memory: Preserving Institutional Knowledge

(33:07) Multi-Agent Design for Deep Code Understanding

(34:48) Data Retrieval and Integration in Asimov

(38:13) Enterprise-Ready: VPC and On-Prem Deployments

(39:41) Reinforcement Learning in Asimov's Development

(41:04) Misha's Journey: From Physics to AI

(42:06) Growing Up in a Science-Driven Desert Town

(53:03) Building General Agents at DeepMind

(56:57) Founding Reflection AI After DeepMind

(58:54) Product-Driven Superintelligence: Why It Matters

(01:02:22) The State of Autonomous Coding Agents

(01:04:26) What's Next for Reflection AI

3 months ago
1 hour 6 minutes 29 seconds

The Rise of Agentic Commerce — Emily Glassberg Sands (Stripe)

Agentic commerce is no longer science fiction — it’s arriving in your browser, your development IDE, and soon, your bank statement. In this episode of The MAD Podcast, Matt Turck sits down with Emily Glassberg Sands, Stripe’s Head of Information, to explore how autonomous “buying bots” and the Model Context Protocol (MCP) are reshaping the very mechanics of online transactions. Emily explains why intent, not clicks, will become the primary interface for shopping and how Stripe’s rails are adapting for tokens, one-time virtual cards, and real-time risk scoring that can tell good bots from bad ones in milliseconds.


We also go deep into Stripe's strategic AI choices. Drawing on $1.4 trillion in annual payment flow—1.3 percent of global GDP—Stripe decided to train its own payments foundation model, turning tens of billions of historical charges into embeddings that boost fraud-catch recall from 59 percent to 97 percent. Emily walks us through the tech: why they chose a BERT encoder over GPT-style decoders, how three MLEs in a “research bubble” birthed the model, and what it takes to run it in production with five-nines reliability and tight latency budgets.


We zoom out to Stripe’s unique vantage point on the broader AI economy. Their data shows the top AI startups hitting $30 million in ARR three times faster than the fastest SaaS companies did a decade ago, with more than half of that revenue already coming from overseas markets. Emily unpacks the new billing playbook—usage-based pricing today, outcome-based pricing tomorrow—and explains why tiny teams of 20–30 people can now build global, vertically focused AI businesses almost overnight.



Stripe

Website - https://stripe.com

X/Twitter - https://x.com/stripe


Emily Glassberg Sands

LinkedIn - https://www.linkedin.com/in/egsands

X/Twitter - https://x.com/emilygsands


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:45) How Big Is Stripe? Latest Stats Revealed

(04:06) What Does “Head of Information” at Stripe Actually Do?

(05:43) From Harvard to Stripe: Emily’s Unusual Journey

(08:54) Why Stripe Built Its Own Foundation Model

(13:19) Cracking the Code: How Stripe Handles Complex Payment Data

(16:25) Foundation Model vs. Traditional ML: What’s Winning?

(20:09) Inside Stripe’s Foundation Model: How It Was Built

(24:35) How Stripe Makes AI Decisions Transparent

(28:38) Where Stripe Uses AI (And Where It Doesn’t)

(34:10) How Stripe’s AI Drives Revenue for Businesses

(41:22) Real-Time Fraud Detection: Stripe’s Secret Sauce

(42:51) The Future of Shopping: AI Agents & Agentic Commerce

(46:20) How Agentic Commerce Is Changing Stripe

(49:36) Stripe’s Vision for a World of AI-Powered Buyers

(55:46) What Is MCP? Stripe’s Take on Agent-to-Agent Protocols

(59:31) Stripe’s Data on AI Startups Monetizing 3× Faster

(01:03:03) How AI Companies Go Global — From Day One

(01:07:48) The New Rules: Billing & Pricing for AI Startups

(01:10:57) How Stripe Builds AI Literacy Across the Company

(01:14:05) Roadmap: Risk-as-a-Service, Order Intent, and Beyond

3 months ago
1 hour 15 minutes 14 seconds

The MAD Podcast with Matt Turck
AI Engineering Revolution: Winners, Chaos & What’s Next | FirstMark

Welcome to a special FirstMark Deep Dive edition of the MAD Podcast. In this episode, Matt Turck and David Waltcher unpack the explosive impact of generative AI on engineering — hands-down the biggest shift the field has seen in decades. You’ll get a front-row seat to the real numbers and stories behind the AI code revolution, including how companies like Cursor hit a $500M valuation in record time, and why GitHub Copilot now serves 15 million developers.


Matt and David break down the six trends that shaped the last 20 years of developer tools, and reveal why coding is the #1 use case for generative AI (hint: it’s all about public data, structure, and ROI). You’ll hear how AI is making engineering teams 30-50% faster, but also why this speed is breaking traditional DevOps, overwhelming QA, and turning top engineers into full-time code reviewers.


We get specific: 82% of engineers are already using AI to write code, but this surge is creating new security vulnerabilities, reliability issues, and a total rethink of team roles. You’ll learn why code review and prompt engineering are now the most valuable skills, and why computer science grads are suddenly facing some of the highest unemployment rates.


We also draw wild historical parallels—from the Gutenberg Press to the Ford assembly line—to show how every productivity boom creates new problems and entire industries to solve them. Plus: what CTOs need to know about hiring, governance, and architecture in the AI era, and why being “AI native” can make a startup more credible than a 10-year-old giant.



Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


David Waltcher

LinkedIn - https://www.linkedin.com/in/davidwaltcher

X/Twitter - https://x.com/davidwaltcher


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap



(00:00) Intro & episode setup

(01:50) The 6 waves that led to GenAI engineering

(04:30) Why coding is such fertile ground for Generative AI

(08:25) Break-out dev-tool winners: Cursor, Copilot, Replit, V0

(11:25) Early stats: Teams Are Shipping Code Faster with AI

(13:32) Copilots vs Autonomous Agents: The Current Reality

(14:14) Lessons from History: Every Tech Boom Creates New Problems

(21:53) FirstMark Survey: The Headaches AI Is Creating for Developers

(22:53) What’s Now Breaking: Security, CI/CD flakes, QA Overload

(29:16) The New CTO Playbook to Adapt to the AI Revolution

(33:23) What Happens to Engineering Orgs if Everyone is a Coder?

(40:19) Founder opportunities & the dev-tool halo effect

(44:24) The Built-in Credibility of AI-Native Startups

(46:16) The Irony of Dev Tools As Biggest Winners in the AI Gold Rush

(47:43) What’s Next for AI and Engineering?

4 months ago
49 minutes 53 seconds

The MAD Podcast with Matt Turck
Guillermo Rauch: Why Software Development Will Never Be the Same

In this episode, Vercel CEO Guillermo Rauch goes deep on how V0, their text-to-app platform, has already generated over 100 million applications and doubled Vercel’s user base in under a year.


Guillermo reveals how a tiny SWAT team inside Vercel built V0 from scratch, why “vibe coding” is making software creation accessible to everyone (not just engineers), and how the AI Cloud is automating DevOps, making cloud infrastructure self-healing, and letting companies expose their data to AI agents in just five lines of code.


You’ll hear why “every company will have to rethink itself as a token factory,” how Vercel’s Next.js went from a conference joke to powering Walmart, Nike, and Midjourney, and why the next billion app creators might not write a single line of code. Guillermo breaks down the difference between vibe coding and agentic engineering, shares wild stories of users building apps from napkin sketches, and explains how Vercel is infusing “taste” and best practices directly into their AI models.


We also dig into the business side: how Vercel’s AI-powered products are driving explosive growth, why retention and margins are strong, and how the company is adapting to a new wave of non-technical users. Plus: the future of MCP servers, the security challenges of agent-to-agent communication, and why prompting and AI literacy are now must-have skills.



Vercel

Website - https://vercel.com

X/Twitter - https://x.com/vercel


Guillermo Rauch

LinkedIn - https://www.linkedin.com/in/rauchg

X/Twitter - https://x.com/rauchg


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(02:08) What Is V0 and Why Did It Take Off So Fast?

(04:10) How Did a Tiny Team Build V0 So Quickly?

(07:51) V0 vs Other AI Coding Tools

(10:35) What is Vibe Coding?

(17:05) Is V0 Just Frontend? Moving Toward Full Stack and Integrations

(19:40) What Skills Make a Great Vibe Coder?

(23:35) Vibe Coding as the GUI for AI: The Future of Interfaces

(29:46) Developer Love = Agent Love

(33:41) Having Taste as a Developer

(39:10) MCP Servers: The New Protocol for AI-to-AI Communication

(43:11) Security, Observability, and the Risks of Agentic Web

(45:25) Are Enterprises Ready for the Agentic Future?

(49:42) Closing the Feedback Loop: Customer Service and Product Evolution

(56:06) The Vercel AI Cloud: From Pixels to Tokens

(01:10:14) How Does Vercel Adapt to the ICP Change?

(01:13:47) Retention, Margins, and the Business of AI Products

(01:16:51) The Secret Behind Vercel's Growth Last Year

(01:24:15) The Importance of Online Presence

(01:30:49) Everything, Everywhere, All at Once: Being CEO 101

(01:34:59) Guillermo's Advice to Younger Self

4 months ago
1 hour 45 minutes 40 seconds

The MAD Podcast with Matt Turck
Inside Canva’s $3B ARR AI Design Rocketship — CTO Brendan Humphreys on Magic Studio & Canva Code

Canva just announced $3 billion in ARR, 230 million monthly active users, and 24 million paying subscribers—including 95% of the Fortune 500. Even more impressive? They’ve been profitable for seven years while growing at 40–50% per year.


In this episode, Canva’s Head of Engineering, Brendan Humphreys, reveals how he went from employee #12 to leading 2,300 engineers across continents, and why Canva’s “pragmatic excellence” lets them ship AI features at breakneck speed—like launching Canva Code to 100 million users in just three months.


Brendan shares the story of Canva’s AI journey: building an in-house ML team back in 2017, acquiring visual AI startups like Kaleido and Leonardo AI, and why they use a hybrid of OpenAI, Anthropic, Google, and their own foundation models. He explains how Canva’s App Store gives niche AI startups instant access to millions, and why their $200M Creator Fund is designed to reward contributors in the AI era. You’ll also hear how AI tools like Copilot are making Canva’s senior engineers 30% more productive, why “vibe coding” isn’t ready for prime time, and the unique challenges of onboarding junior engineers in an AI-driven world.


We also dig into Canva’s approach to technical debt, scaling from 12 to 5,000 employees, and why empathy is a core engineering skill at Canva.



Canva

Website - https://www.canva.com

X/Twitter - https://x.com/canva


Brendan Humphreys

LinkedIn - https://www.linkedin.com/in/brendanhumphreys

X/Twitter - https://x.com/brendanh


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:14) Canva’s Mind-Blowing Growth and Profitable Journey

(03:41) Why Brendan Left Atlassian to Join a Tiny Startup

(06:17) What Being a Founder Taught Brendan About Leadership

(07:24) Growing with Canva: From 12 Employees to 2,300 Engineers

(10:02) How Canva Runs a Global Team from Sydney to Europe

(13:16) Is AI a Threat or a Superpower for Canva?

(15:22) The Real Story Behind Canva’s AI and Machine Learning Team

(17:23) How Canva Ships New AI Features So Fast

(19:19) A Tour of Canva’s Latest AI-Powered Products

(21:03) From Design Tool to All-in-One Productivity Platform

(26:21) Keeping Up the Pace: How Canva Moves So Quickly

(30:22) The Future: AI Agents, Copilots, and Smarter Workflows

(33:14) How AI Tools Are Changing the Way Engineers Work

(35:47) Rethinking Hiring and Training in the Age of AI

(37:01) Why Empathy Matters in Engineering at Canva

(39:41) Building vs. Buying: How Canva Chooses Its AI Tech

(41:23) Lessons Learned: Technical Debt and Scaling Pains

(51:18) Shipping Fast Without Breaking Things

(53:08) What’s Next: AI Video, New Features, and Big Ambitions

4 months ago
56 minutes 38 seconds

The MAD Podcast with Matt Turck
GitHub CEO: The AI Coding Gold Rush, Vibe Coding & Cursor

AI coding is in full-blown gold-rush mode, and GitHub sits at the epicenter. In this episode, GitHub CEO Thomas Dohmke tells Matt Turck how a $7.5B acquisition in 2018 became a $2B ARR rocket ship, and reveals how Copilot was born from a secret AI strategy years before anyone else saw the opportunity.


We dig into the dizzying pace of AI innovation: why developer tools are suddenly the fastest-growing startups in history, how GitHub’s multi-model approach (OpenAI, Anthropic Claude 4, Gemini 2.5, and even local LLMs) gives you more choice and speed, and why fine-tuning models might be overrated. Thomas explains how Copilot keeps you in the “magic flow state,” and how even middle schoolers are using it to hack Minecraft.


The conversation then zooms out to the competitive battlefield: Cursor’s $10B valuation, Mistral’s new code model, and a wave of AI-native IDE forks vying for developer mind-share. We discuss why 2025’s “coding agents” could soon handle 90% of the world’s code, whether SaaS will survive, and why the future of coding is about managing agents, not just writing them.



GitHub

Website - https://github.com/

X/Twitter - https://x.com/github


Thomas Dohmke

LinkedIn - https://www.linkedin.com/in/ashtom

X/Twitter - https://twitter.com/ashtom


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:50) Why AI Coding Is Ground Zero for Generative AI

(02:40) The $7.5B GitHub Acquisition: Microsoft’s Strategic Play

(06:21) GitHub’s Role in the Azure Cloud Ecosystem

(10:25) How GitHub Copilot Beat Everyone to Market

(16:09) Copilot & VS Code Explained for Non-Developers

(21:02) GitHub Models: Multi-Model Choice and What It Means

(25:31) The Reality of Fine-Tuning AI Models for Enterprise

(29:13) The Dizzying Pace and Political Economy of AI Coding Tools

(36:58) Competing and Partnering: Microsoft’s Unique AI Strategy

(41:29) Does Microsoft Limit Copilot’s AI-Native Potential?

(46:44) The Bull and Bear Case for AI-Native IDEs Like Cursor

(52:09) Agent Mode: The Next Step for AI-Powered Coding

(01:00:10) How AI Coding Will Change SaaS and Developer Skills

4 months ago
1 hour 4 minutes 46 seconds

The MAD Podcast with Matt Turck
Inside the Paper That Changed AI Forever - Cohere CEO Aidan Gomez on 2025 Agents

What really happened inside Google Brain when the “Attention is All You Need” paper was born? In this episode, Aidan Gomez — one of the eight co-authors of the Transformers paper and now CEO of Cohere — reveals the behind-the-scenes story of how a cold email and a lucky administrative mistake landed him at the center of the AI revolution.


Aidan shares how a group of researchers, given total academic freedom, accidentally stumbled into one of the most important breakthroughs in AI history — and why the architecture they created still powers everything from ChatGPT to Google Search today.


We dig into why synthetic data is now the secret sauce behind the world’s best AI models, and how Cohere is using it to build enterprise AI that’s more secure, private, and customizable than anything else on the market. Aidan explains why he’s not interested in “building God” or chasing AGI hype, and why he believes the real impact of AI will be in making work more productive, not replacing humans.


You’ll also get a candid look at the realities of building an AI company for the enterprise: from deploying models on-prem and air-gapped for banks and telecoms, to the surprising demand for multimodal and multilingual AI in Japan and Korea, to the practical challenges of helping customers identify and execute on hundreds of use cases.



Cohere

Website - https://cohere.com

X/Twitter - https://x.com/cohere


Aidan Gomez

LinkedIn - https://ca.linkedin.com/in/aidangomez

X/Twitter - https://x.com/aidangomez


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(02:00) The Story Behind the Transformers Paper

(03:09) How a Cold Email Landed Aidan at Google Brain

(10:39) The Initial Reception to the Transformers Breakthrough

(11:13) Google’s Response to the Transformer Architecture

(12:16) The Staying Power of Transformers in AI

(13:55) Emerging Alternatives to Transformer Architectures

(15:45) The Significance of Reasoning in Modern AI

(18:09) The Untapped Potential of Reasoning Models

(24:04) Aidan’s Path After the Transformers Paper and the Founding of Cohere

(25:16) Choosing Enterprise AI Over AGI Labs

(26:55) Aidan’s Perspective on AGI and Superintelligence

(28:37) The Trajectory Toward Human-Level AI

(30:58) Transitioning from Researcher to CEO

(33:27) Cohere’s Product and Platform Architecture

(37:16) The Role of Synthetic Data in AI

(39:32) Custom vs. General AI Models at Cohere

(42:23) The Aya Models and Cohere Labs Explained

(44:11) Enterprise Demand for Multimodal AI

(49:20) On-Prem vs. Cloud

(50:31) Cohere’s North Platform

(54:25) How Enterprises Identify and Implement AI Use Cases

(57:49) The Competitive Edge of Early AI Adoption

(01:00:08) Aidan’s Concerns About AI and Society

(01:01:30) Cohere’s Vision for Success in the Next 3–5 Years

5 months ago
1 hour 2 minutes 24 seconds

The MAD Podcast with Matt Turck
AI That Ends Busy Work — Hebbia CEO on “Agent Employees”

What if the smartest people in finance and law never had to do “stupid tasks” again? In this episode, we sit down with George Sivulka, founder of Hebbia, the AI company quietly powering 50% of the world’s largest asset managers and some of the fastest-growing law firms.


George reveals how Hebbia’s Matrix platform is automating the equivalent of 50,000 years of human reading — every year — and why the future of work is hybrid teams of humans and AI “agent employees.” You’ll get the inside story on how Hebbia went from a stealth project at Stanford to a multinational company trusted by the Department of Defense, and why their spreadsheet-inspired interface is leaving chatbots in the dust. George breaks down the technical secrets behind Hebbia’s ISD architecture (and why they killed RAG), how they process billions of pages with near-zero hallucinations, and what it really takes to sell AI into the world’s most regulated industries.


We also dive into the future of organizational design, why generalization beats specialization in AI, and how “prompting is the new management skill.” Plus: the real story behind AI hallucinations, the myth of job loss, and why naiveté might be the ultimate founder superpower.



Hebbia

Website - https://www.hebbia.com

Twitter - https://x.com/HebbiaAI


George Sivulka

LinkedIn - https://www.linkedin.com/in/sivulka

Twitter - https://x.com/gsivulka


FIRSTMARK

Website - https://firstmark.com

Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:46) What is Hebbia

(02:49) Evolving Hebbia’s mission

(04:45) The founding story and Stanford's inspiration

(09:45) The rise of agent employees and AI in organizations

(12:36) The future of AI-powered work

(15:17) AI research trends

(19:49) Inside Matrix: Hebbia’s flagship AI platform

(24:02) Why Hebbia isn’t just another chatbot

(28:27) Moving beyond RAG: Hebbia’s unique architecture

(34:10) Tackling hallucinations in high-stakes AI

(35:59) Research culture and avoiding industry groupthink

(39:40) Innovating go-to-market and enterprise sales

(41:57) Real-world value: Cost savings and new revenue

(43:49) How AI is changing junior roles

(45:55) Leadership and perspective as a young founder

(47:16) Hebbia’s roadmap: Success in the next 3 years

5 months ago
48 minutes 24 seconds

The MAD Podcast with Matt Turck
AI Eats the World: Benedict Evans on What Really Matters Now

What if the “AI revolution” is actually… stuck in the messy middle? In this episode, Benedict Evans returns to tackle the big question we left hanging a year ago: Is AI a true paradigm shift, or just another tech platform shift like mobile or cloud? One year later, the answer is more complicated — and more revealing — than anyone expected.


Benedict pulls back the curtain on why, despite all the hype and model upgrades, the core LLMs are starting to look like commodities. We dig into the real battlegrounds: distribution, brand, and the race to build sticky applications. Why is ChatGPT still topping the App Store charts while Perplexity and Claude barely register outside Silicon Valley? Why did OpenAI just hire a CEO of Applications, and what does that signal about the future of AI products?


We go deep on the “probabilistic” nature of LLMs, why error rates are still the elephant in the room, the future of consumer AI (is there a killer app beyond chatbots and image generators?), the impact of generative content on e-commerce and advertising, and whether “AI agents” are the next big thing — or just another overhyped demo.


And, we ask: What happened to AI doomerism? Why did the existential risk debate suddenly vanish, and what risks should we actually care about?



Benedict Evans

LinkedIn - https://www.linkedin.com/in/benedictevans

Threads - https://www.threads.net/@benedictevans


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:47) Is AI a Platform Shift or a Paradigm Shift?

(07:21) Error Rates and Trust in AI

(15:07) Adapting to AI’s Capabilities

(19:18) Generational Shifts in AI Usage

(22:10) The Commoditization of AI Models

(27:02) Are Brand and Distribution the Real Moats in AI?

(29:38) OpenAI: Research Lab or Application Company?

(33:26) Big Tech’s AI Strategies: Apple, Google, Meta, AWS

(39:00) AI and Search: Is ChatGPT a Search Engine?

(42:41) Consumer AI Apps: Where’s the Breakout?

(45:51) The Need for a GUI for AI

(48:38) Generative AI in Social and Content

(51:02) The Business Model of AI: Ads, Memory, and Moats

(55:26) Enterprise AI: SaaS, Pilots, and Adoption

(01:00:08) The Future of AI in Business

(01:05:11) Infinite Content, Infinite SKUs: AI and E-commerce

(01:09:42) Doomerism, Risks, and the Future of AI

5 months ago
1 hour 15 minutes 9 seconds

The MAD Podcast with Matt Turck
Jeremy Howard on Building 5,000 AI Products with 14 People (Answer AI Deep-Dive)

What happens when you try to build the “General Electric of AI” with just 14 people? In this episode, Jeremy Howard reveals the radical inside story of Answer AI — a new kind of AI R&D lab that’s not chasing AGI, but instead aims to ship thousands of real-world products, all while staying tiny, open, and mission-driven.


Jeremy shares how open-source models like DeepSeek and Qwen are quietly outpacing closed-source giants, and why the best new AI is coming out of China. You’ll hear the surprising truth about the so-called “DeepSeek moment,” why efficiency and cost are the real battlegrounds in AI, and how Answer AI’s “dialogue engineering” approach is already changing lives—sometimes literally.


We go deep on the tools and systems powering Answer AI’s insane product velocity, including Solve It (the platform that’s helped users land jobs and launch startups), ShellSage (AI in your terminal), and FastHTML (a new way to build web apps in pure Python). Jeremy also opens up about his unconventional path from philosophy major and computer game enthusiast to world-class AI scientist, and why he believes the future belongs to small, nimble teams who build for societal benefit, not just profit.



Fast.ai

Website - https://www.fast.ai

X/Twitter - https://twitter.com/fastdotai


Answer.ai

Website - https://www.answer.ai/

X/Twitter - https://x.com/answerdotai


Jeremy Howard

LinkedIn - https://linkedin.com/in/howardjeremy

X/Twitter - https://x.com/jeremyphoward


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:39) Highlights and takeaways from ICLR Singapore

(02:39) Current state of open-source AI

(03:45) Thoughts on Microsoft Phi and open source moves

(05:41) Responding to OpenAI’s open source announcements

(06:29) The real impact of the DeepSeek ‘moment’

(09:02) Progress and promise in test-time compute

(10:53) Where we really stand on AGI and ASI

(15:05) Jeremy’s journey from philosophy to AI

(20:07) Becoming a Kaggle champion and starting Fast.ai

(23:04) Answer.ai mission and unique vision

(28:15) Answer.ai’s business model and early monetization

(29:33) How a small team at Answer.ai ships so fast

(30:25) Why the Devin AI agent isn't that great

(33:10) The future of autonomous agents in AI development

(34:43) Dialogue Engineering and Solve It

(43:54) How Answer.ai decides which projects to build

(49:47) Future of Answer.ai: staying small while scaling impact

5 months ago
55 minutes 2 seconds

The MAD Podcast with Matt Turck
Why Influx Rebuilt Its Database for the IoT and Robotics Explosion

InfluxDB just dropped its biggest update ever — InfluxDB 3.0 — and in this episode, we go deep with the team behind the world’s most popular open-source time series database.


You’ll hear the inside story of how InfluxDB grew from 3,000 users in 2015 to over 1.3 million today, and why the company decided to rewrite its entire architecture from scratch in Rust, ditching Go and moving to object storage on S3.


We break down the real technical challenges that forced this radical shift: the “cardinality problem” that choked performance, the pain of linking compute and storage, and why their custom query language (Flux) failed to catch on, leading to a humbling embrace of SQL as the industry standard. You’ll learn how InfluxDB is positioning itself in a world dominated by Databricks and Snowflake, and the hard lessons learned about monetization when 1.3 million users only yield 2,600 paying customers.


InfluxData

Website - https://www.influxdata.com

X/Twitter - https://twitter.com/InfluxDB


Evan Kaplan

LinkedIn - https://www.linkedin.com/in/kaplanevan

X/Twitter - https://x.com/evankaplan


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


Foursquare

Website - https://foursquare.com

X/Twitter - https://x.com/Foursquare

Instagram - https://instagram.com/foursquare



(00:00) Intro

(02:22) The InfluxDB origin story and why time series matters

(06:59) The cardinality crisis and why Influx rebuilt in Rust

(09:26) Why SQL won (and Flux lost)

(16:34) Why InfluxData bets on FDAP

(22:51) IoT, Tesla Powerwalls, and real-time control systems

(27:54) Competing with Databricks, Snowflake, and the “lakehouse” world

(31:50) Open Source lessons, monetization, & what’s next

5 months ago
35 minutes 35 seconds

The MAD Podcast with Matt Turck
Dashboards Are Dead: Sigma’s BI Revolution for Trillion-Row Data

Sigma Computing recently hit $100M in ARR — and plans to double revenue again this year. In this episode, CEO Mike Palmer reveals exactly how they did it by throwing out the old BI playbook. We open with the provocative claim that “the world did not need another BI tool,” and dig into why the last 20 years of business intelligence have been “boring.” He explains how Sigma’s spreadsheet-like interface lets anyone analyze billions of rows in seconds, living directly on top of Snowflake and Databricks, with no SQL required and no data extracts.


Mike shares the inside story of Sigma’s journey: why they shut down their original product to rebuild from scratch, how Sutter Hill Ventures’ unique incubation model shaped the company, what it took to go from $2M to $100M ARR in just three years and raise a $200M round — even as the growth stage VC market dried up. We get into the technical details behind Sigma’s architecture: no caching, no federated queries, and real-time, Google Sheets-style collaboration at massive scale—features that have convinced giants like JP Morgan and ExxonMobil to ditch legacy dashboards for good.


We also tackle the future of BI and the modern data stack: why 99.99% of enterprise data is never touched, what’s about to happen as the stack consolidates, and why Mike thinks “text-to-SQL” AI is a “terrible idea.”


This episode is full of spicy takes: Mike shares his thoughts on how Google missed the zeitgeist, the reality behind Microsoft Fabric, when engineering hubris leads to failure, and more.


Sigma

Website - https://www.sigmacomputing.com

X/Twitter - https://x.com/sigmacomputing


Mike Palmer

LinkedIn - https://www.linkedin.com/in/mike-palmer-51a154


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


Foursquare

Website - https://foursquare.com

X/Twitter - https://x.com/Foursquare

Instagram - https://instagram.com/foursquare



(00:00) Intro

(01:46) Why traditional BI is boring

(04:15) What is business intelligence?

(06:03) Classic BI roles and frustrations

(07:09) Sigma’s origin story: Sutter Hill & the Snowflake echo

(09:02) The spreadsheet problem: why nothing changed since 1985

(14:04) Rebooting the product during lockdown

(16:14) Building a spreadsheet UX on top of Snowflake/Databricks

(18:55) No caching, no federation: Sigma’s architectural choices

(20:28) Spreadsheet interface at scale

(21:32) Collaboration and real-time data workflows

(24:15) Semantic layers, data governance & trillion-row performance

(25:57) The modern data stack: fragmentation and consolidation

(28:38) Democratizing data

(29:36) Will hyperscalers own the data stack?

(34:12) AI, natural language, and the limits of text-to-SQL

6 months ago
41 minutes 32 seconds

The MAD Podcast with Matt Turck
The MAD Podcast with Matt Turck is a series of conversations with leaders from across the Machine Learning, AI, & Data landscape, hosted by leading AI & data investor and Partner at FirstMark Capital, Matt Turck.