AI Judges Infect Each Other With Bias in Multi-Agent Systems

AI evaluation bias turns out to be contagious.

Researchers have published a formal framework called Contagion Networks that measures how evaluation biases spread when large language models act as judges in multi-agent pipelines. In a controlled experiment using three DeepSeek-chat agents, each given a different evaluator profile (structured, balanced, or evidence-based), they measured how strongly one agent's biases bled into another's. The transfer was consistent, with contagion coefficients ranging from 0.157 to 0.352. That is lower than the 0.85 to 1.3 range seen in earlier cross-model work, but it still means bias spreads even when every agent runs on the same underlying model.

This matters because multi-agent LLM systems are increasingly used to grade, filter, and rank AI-generated output, including in automated research pipelines and model evaluation benchmarks. If the evaluator agents share a systematic blind spot, that blind spot does not cancel out; it propagates. The finding puts a concrete number on a risk that most builders of these systems have treated as theoretical.

The paper also offers a practical fix: expanding the evaluator committee from one judge to three cuts effective contagion by 72.4%. That is a meaningful lever, though it also means tripling inference costs for every evaluation step. The researchers released their experimental framework as open source, so teams can measure their own pipelines' contagion coefficients before assuming the problem does not apply to them.
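
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Python. It assumes, purely for illustration, that the 72.4% reduction applies uniformly to the reported coefficient range and that inference cost scales linearly with the number of judges; the paper may define "effective contagion" differently. The function names are hypothetical and are not part of the released framework.

```python
# Figures quoted in the article.
REPORTED_RANGE = (0.157, 0.352)  # contagion coefficients for same-model agents
COMMITTEE_REDUCTION = 0.724      # reported reduction with a three-judge committee


def residual_contagion(coefficient: float,
                       reduction: float = COMMITTEE_REDUCTION) -> float:
    """Return the coefficient left after applying a fractional reduction.

    ASSUMPTION: the reduction scales every coefficient uniformly. This is
    an illustrative simplification, not the paper's stated method.
    """
    if not 0.0 <= reduction <= 1.0:
        raise ValueError("reduction must be between 0 and 1")
    return coefficient * (1.0 - reduction)


def inference_cost_multiplier(judges: int, baseline_judges: int = 1) -> float:
    """Return the relative inference cost of a committee.

    ASSUMPTION: each judge costs the same and cost grows linearly
    with the number of judges.
    """
    if judges < 1 or baseline_judges < 1:
        raise ValueError("judge counts must be at least 1")
    return judges / baseline_judges


if __name__ == "__main__":
    low, high = (residual_contagion(c) for c in REPORTED_RANGE)
    print(f"Residual contagion under these assumptions: {low:.3f} to {high:.3f}")
    print(f"Inference cost multiplier: {inference_cost_multiplier(3):.0f}x")
```

Under those assumptions, the residual coefficients would fall to roughly 0.043 to 0.097 at three times the evaluation cost. A team weighing the fix would want to replace these placeholder figures with coefficients measured on its own pipeline.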
