
In this episode of Allied Angels, we unlock the inner workings of large language models (LLMs) such as Claude 3.5 Haiku by breaking down Anthropic's latest research: Tracing the thoughts of a large language model.
Join us as we delve into mechanistic interpretability, exploring how AI truly "thinks" by revealing its computational graphs and underlying circuits.
Discover the innovative circuit tracing methodology, which uses attribution graphs and cross-layer transcoders (CLTs) to dissect the complex processes within these models.
We uncover interpretable features – the building blocks of AI computation – and map their interactions to understand how models generate text and perform tasks.
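For the technically curious, here is a minimal sketch of the cross-layer transcoder idea, assuming a PyTorch-style setup. All names and shapes are illustrative, not Anthropic's actual implementation: an encoder reads the residual stream entering one layer and produces sparse, non-negative feature activations, and per-layer decoders write those features back into the MLP outputs of that layer and every later one.

```python
import torch
import torch.nn as nn

class CrossLayerTranscoder(nn.Module):
    # Illustrative sketch of a cross-layer transcoder (CLT): sparse
    # features read from one layer's residual stream write their
    # reconstructions into this layer's and all later layers' MLP outputs.
    def __init__(self, d_model: int, n_features: int, n_downstream_layers: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, n_features)
        # One linear decoder per downstream layer the features write to.
        self.decoders = nn.ModuleList(
            [nn.Linear(n_features, d_model, bias=False)
             for _ in range(n_downstream_layers)]
        )

    def forward(self, resid: torch.Tensor):
        # ReLU keeps activations non-negative; sparsity would come from
        # a penalty during training, omitted here for brevity.
        acts = torch.relu(self.encoder(resid))
        # Return the feature activations and one reconstruction per layer.
        return acts, [dec(acts) for dec in self.decoders]
```

The point of the substitution is interpretability: the hard-to-read MLP neurons are replaced by a dictionary of features that each tend to fire on a single human-recognizable concept, which is what makes the attribution graphs legible.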
We also explore fascinating "AI biology" as we trace the pathways behind diverse behaviors, such as:
• Multilingualism: Uncover evidence of a shared conceptual space and both language-specific and language-independent circuits.
• Planning: Learn how language models plan their outputs, even in creative tasks like poetry generation, by identifying future words and working backward.
• Refusals: Understand the mechanisms behind a model's decision to decline harmful requests and how specific features contribute to this behavior.
• Jailbreaks: Investigate prompting strategies that can bypass safety mechanisms and the underlying weaknesses they exploit.
• Factual Recall: See how models access and utilize factual knowledge to answer questions.
• Addition: Delve into the surprisingly intricate circuits responsible for simple arithmetic (see the toy sketch after this list).
• Entity Recognition and Hallucinations: Learn how models distinguish between known and unknown entities and the circuit misfires that can lead to fabricated information.
• Chain-of-thought Faithfulness: Examine whether a model's stated reasoning aligns with its actual computational steps.
• Hidden Goals: Uncover how fine-tuning can embed secret objectives within a model's persona.
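On the addition example: the paper describes parallel pathways, with features that track the exact ones digit working alongside features that track rough magnitude, converging on the answer. The toy Python below echoes that decomposition; it is not the model's actual circuit, and the exact sum used as the "estimate" stands in for a signal the model computes only fuzzily.

```python
def toy_addition(a: int, b: int) -> int:
    # Two parallel "pathways", combined at the end: an exact ones-digit
    # lookup plus a coarse magnitude estimate, loosely echoing the
    # lookup-table and magnitude features described in the paper.
    ones = (a % 10 + b % 10) % 10      # exact ones digit: 36 + 59 -> 5
    estimate = a + b                   # stand-in for a fuzzy magnitude signal
    base = round(estimate / 10) * 10   # snap the estimate to the nearest ten
    candidates = (base - 10 + ones, base + ones)
    # Pick whichever candidate with the right ones digit sits closest
    # to the magnitude estimate.
    return min(candidates, key=lambda c: abs(c - estimate))

assert toy_addition(36, 59) == 95
```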
Gain insights into the limitations of current methods, including missing attention circuits, reconstruction errors, and the challenges of understanding global circuits. We also discuss the crucial role of validation through perturbation experiments.
This podcast provides a unique window into the "thoughts" of large language models, revealing the fascinating interplay of features and circuits that drive their capabilities and limitations.
Tune in to explore the cutting-edge of AI interpretability and the quest to build an "AI microscope" to understand the complex world within.
------------
Allied VC is Western Canada's largest angel syndicate, investing in early-stage technology startups across Canada and the USA.
Pitch us, Invest, Scout, and more: https://linktr.ee/alliedvc
Allied Angels is powered by NotebookLM - Google's new AI note-taking & research assistant.