
This episode dives deep into the Thinking Machines Lab publication that addresses the challenge of achieving reproducible large language model (LLM) inference, noting that even with greedy sampling (temperature set to 0), results are often nondeterministic.
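To see why temperature 0 alone does not guarantee identical outputs, note that a root cause discussed in the publication is that floating-point addition is not associative: summing the same numbers in a different order (as GPU kernels may do when batch size or scheduling changes) can produce different results. A minimal Python sketch (the specific values are illustrative, not from the publication):

```python
# Floating-point addition is not associative: grouping changes the result.
vals = [1e20, -1e20, 1.0]

left = (vals[0] + vals[1]) + vals[2]   # large terms cancel first -> 1.0
right = vals[0] + (vals[1] + vals[2])  # 1.0 is absorbed by -1e20 -> 0.0

print(left)   # 1.0
print(right)  # 0.0
```

Because a reduction's grouping can depend on how work is split across GPU threads, the "same" computation can yield slightly different logits run to run, which greedy decoding can then amplify into entirely different token sequences.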