
This paper introduces GEPA (Genetic-Pareto), a prompt optimizer for large language models (LLMs) and compound AI systems. Unlike traditional reinforcement learning (RL) methods, which rely on sparse numerical rewards and often tens of thousands of rollouts, GEPA uses natural language reflection to learn high-level rules from trial and error, sharply reducing the number of rollouts required. It works by analyzing full system trajectories, diagnosing failures, proposing prompt updates, and combining complementary lessons through a Pareto-frontier search over candidate prompts. The paper presents evidence that GEPA outperforms existing RL and prompt-optimization techniques in sample efficiency and generalization across several benchmarks, while also producing shorter, more efficient prompts than alternatives such as MIPROv2.
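
To make the described loop concrete, here is a minimal sketch of how reflective mutation and Pareto-frontier candidate selection could fit together. This is not the authors' implementation: the helper functions `run_system` (execute the compound system, returning per-task scores and natural-language traces) and `reflect_and_rewrite` (prompt an LLM to diagnose failures in those traces and propose an improved prompt) are hypothetical names standing in for the components the paper describes.

```python
import random

def gepa_optimize(seed_prompt, train_batch, run_system, reflect_and_rewrite,
                  budget=100):
    """Sketch of a GEPA-style loop under assumed helper signatures:

    run_system(prompt, batch) -> (scores, trajectories)
        Executes the system on the batch; returns a per-task score vector
        and the natural-language execution traces.
    reflect_and_rewrite(prompt, trajectories) -> new_prompt
        Asks an LLM to diagnose failures visible in the traces and
        rewrite the prompt accordingly.
    """
    # Candidate pool: (prompt, per-task score vector) pairs.
    scores, _ = run_system(seed_prompt, train_batch)
    pool = [(seed_prompt, scores)]

    for _ in range(budget):
        # Pareto-based selection: keep any candidate that is best on at
        # least one task, so complementary "lessons" survive in the pool
        # instead of being crowded out by a single high-average prompt.
        frontier = [
            (p, s) for p, s in pool
            if any(s[i] >= max(other[i] for _, other in pool)
                   for i in range(len(s)))
        ]
        parent_prompt, _ = random.choice(frontier)

        # Reflective mutation: rerun the system, collect traces, and let
        # an LLM turn the observed failures into a concrete prompt edit.
        _, trajectories = run_system(parent_prompt, train_batch)
        child_prompt = reflect_and_rewrite(parent_prompt, trajectories)

        child_scores, _ = run_system(child_prompt, train_batch)
        pool.append((child_prompt, child_scores))

    # Return the candidate with the best aggregate score.
    return max(pool, key=lambda ps: sum(ps[1]))[0]
```

The per-task score vectors, rather than a single scalar reward, are what make the Pareto step meaningful here: a prompt that fixes a rare failure mode stays in the pool even if its average score is unremarkable, which matches the paper's motivation for frontier-based search over greedy selection.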