Best AI papers explained
Enoch H. Kang
424 episodes
3 days ago
Cut through the noise. We curate and break down the most important AI papers so you don’t have to.
Technology
All content for Best AI papers explained is the property of Enoch H. Kang and is served directly from their servers with no modification, redirects, or rehosting. The podcast is not affiliated with or endorsed by Podjoint in any way.
From Model Weights to Agent Workflows: Charting the New Frontier of Optimization in Large Language Models
Best AI papers explained
16 minutes 51 seconds
1 week ago

We discuss a significant shift in artificial intelligence: from optimizing single, monolithic **Large Language Models (LLMs)** to optimizing complex, multi-component **LLM agents**. Previously, optimization focused on tuning model **weights ($\theta$)** with methods such as **Reinforcement Learning from Human Feedback (RLHF)**, which rest on a clear mathematical objective: a **KL-regularized expected reward**. The emerging paradigm of agent optimization instead tunes an entire **workflow program ($\Pi$)**, comprising textual prompts, tool usage, and control-flow logic. This creates a challenging, **non-differentiable**, **combinatorial optimization space** with no comparable closed-form objective. The episode then analyzes two prominent frameworks that attempt to bring structure to this new problem: **DSPy**, which treats it as a **program search problem**, and **LLM-AutoDiff**, which introduces a **"calculus of prompts"** with **"textual gradients"**, although the latter still relies on semantic, rather than strictly mathematical, objectives.
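For reference, the KL-regularized expected reward mentioned above is typically written as follows (a standard RLHF formulation; the notation here follows common usage rather than the episode itself):

$$\max_{\theta}\;\mathbb{E}_{x \sim \mathcal{D},\; y \sim \pi_\theta(\cdot\mid x)}\big[\,r(x,y)\,\big] \;-\; \beta\,D_{\mathrm{KL}}\!\big(\pi_\theta(\cdot\mid x)\,\big\|\,\pi_{\mathrm{ref}}(\cdot\mid x)\big)$$

where $\pi_\theta$ is the policy being tuned, $\pi_{\mathrm{ref}}$ is the frozen reference model, $r$ is a learned reward model, and $\beta$ weights the KL penalty that keeps the tuned policy close to the reference.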
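To make the contrast concrete, here is a minimal Python sketch of optimizing a workflow program $\Pi$ as a program search problem, in the spirit of the DSPy framing. This is an illustrative toy, not the DSPy API: `llm()`, `make_workflow()`, the candidate instructions, and the metric are all hypothetical stand-ins.

```python
# Workflow optimization as discrete search over (instruction, demos),
# in the spirit of DSPy's compile step. Everything here is a
# hypothetical placeholder, not any library's real interface.
import random

def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real client."""
    return "stub answer for: " + prompt[-40:]

def make_workflow(instruction: str, demos: list[tuple[str, str]]):
    """A workflow program Pi: an instruction, few-shot demos, and control flow."""
    def program(question: str) -> str:
        shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in demos)
        return llm(f"{instruction}\n{shots}\nQ: {question}\nA:")
    return program

def metric(program, devset) -> float:
    """Task score on a small dev set (exact match, for illustration)."""
    return sum(program(q).strip() == a for q, a in devset) / len(devset)

def compile_workflow(instructions, demo_pool, devset, trials=20, seed=0):
    """Search the combinatorial space of (instruction, demo subset):
    no gradients, just evaluate candidate programs and keep the best."""
    rng = random.Random(seed)
    best, best_score = None, -1.0
    for _ in range(trials):
        cand = make_workflow(rng.choice(instructions),
                             rng.sample(demo_pool, k=min(2, len(demo_pool))))
        score = metric(cand, devset)
        if score > best_score:
            best, best_score = cand, score
    return best, best_score
```

The key point is that nothing here is differentiable: the optimizer can only propose whole candidate programs and score them end-to-end, which is what makes the space combinatorial.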
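And a sketch of the alternative framing: a single "textual gradient" update in the spirit of LLM-AutoDiff, where the "gradient" is natural-language feedback from a critic model, applied as a prompt edit. Again, `llm()`, `critic_llm()`, and `editor_llm()` are hypothetical stubs for illustration, not any library's real interface.

```python
# One "textual gradient" step: forward pass, textual feedback as the
# gradient, and a prompt edit as the descent step. All stubs below
# are hypothetical placeholders.

def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real client."""
    return "stub answer"

def critic_llm(prompt: str, output: str, target: str) -> str:
    """Hypothetical critic: explains, in text, how the prompt fell short."""
    return f"The prompt should steer the answer toward '{target}'."

def editor_llm(prompt: str, feedback: str) -> str:
    """Hypothetical editor: rewrites the prompt to apply the feedback."""
    return prompt + "\nRevision note: " + feedback

def textual_gradient_step(prompt: str, question: str, target: str) -> str:
    """One 'backward pass' over the prompt rather than the weights."""
    output = llm(f"{prompt}\nQ: {question}\nA:")   # forward pass
    feedback = critic_llm(prompt, output, target)  # textual "gradient"
    return editor_llm(prompt, feedback)            # "descent" step
```

Unlike the RLHF objective above, the feedback here is semantic rather than numeric, which is exactly the sense in which the episode says LLM-AutoDiff still lacks a strictly mathematical objective.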
