Here’s the newest bright idea in AI: don’t pay humans to evaluate model outputs, let another model do it. This is the “LLM-as-a-judge” craze: models not just spitting out answers but grading them too, like a student slipping themselves the answer key. It sounds efficient, until you realize you’ve built the academic equivalent of letting someone’s cousin sit on their jury.

The problem is called preference leakage. Li et al. nailed it in their paper “Preference Leakage: A Contamination Problem in LLM-as-a-Judge.” They found that when a model judges an output that looks like itself (same architecture, same training lineage, or same model family), it tends to give a higher score. Not because the output is objectively better, but because it “feels familiar.” That’s not evaluation; that’s model nepotism.
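To make the failure mode concrete, here is a minimal sketch (not from the paper) of how you might probe a judge for this bias: have it score the same questions answered by models from its own family and by unrelated models, then look at the gap. The `call_judge` helper is a hypothetical placeholder for whatever LLM client you actually use.

```python
from statistics import mean

def call_judge(judge_model: str, question: str, answer: str) -> float:
    """Hypothetical placeholder: prompt `judge_model` to rate `answer`
    on a 1-10 scale and parse the number out of its reply."""
    raise NotImplementedError("wire this up to your own LLM client")

def family_bias(judge_model: str,
                same_family: list[dict],
                other_family: list[dict]) -> float:
    """Average score the judge gives outputs from its own model family,
    minus the average it gives unrelated models, on the same questions.
    A persistently positive gap is the preference-leakage signature."""
    own = mean(call_judge(judge_model, ex["question"], ex["answer"])
               for ex in same_family)
    other = mean(call_judge(judge_model, ex["question"], ex["answer"])
                 for ex in other_family)
    return own - other
```

Run it with answers of comparable quality on both sides; if the gap persists after you swap which family produced which answers, the judge is grading lineage, not content.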
In this episode of Pop Goes the Stack, F5's Lori MacVittie, Joel Moses, and Ken Arora explore the concept of preference leakage in AI judgement systems. Tune in to understand the risks, the impact on the enterprise, and actionable strategies to improve model fairness, security, and reliability.