AI agents are finding their way into legal discovery, and researchers say that without human checkpoints, a single early error can quietly invalidate an entire case's privilege review.
A paper published on arXiv identifies a failure mode the authors call "trajectory collapse": when an LLM agent misclassifies a document early in a multi-step review, that mistake compounds through the rest of the workflow without any visible alarm. To address this, the researchers propose a four-layer verification architecture covering planning, reasoning, execution, and uncertainty measurement. In their simulation on a synthetic document corpus, requiring human escalation at calibrated uncertainty thresholds cut privilege-waiver risk by up to 61% compared with fully autonomous agents, while still routing fewer than one in four documents to an attorney for review.
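For readers who want to picture what uncertainty-triggered escalation looks like in practice, here is a minimal sketch. It is not the paper's implementation: the `Classification` record, the `route` function, and the 0.9 confidence cutoff are all hypothetical, and the sketch assumes the agent reports a calibrated confidence score for each privilege call.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Classification:
    """One privilege call made by the agent (hypothetical record shape)."""
    doc_id: str
    label: str         # "privileged" or "not_privileged"
    confidence: float  # agent's confidence in the label, assumed calibrated, in [0, 1]


def needs_attorney(call: Classification, threshold: float = 0.9) -> bool:
    """Escalate when confidence falls below the threshold.

    The 0.9 default is a placeholder, not a value taken from the paper.
    """
    return call.confidence < threshold


def route(
    calls: List[Classification], threshold: float = 0.9
) -> Tuple[List[Classification], List[Classification]]:
    """Split agent calls into auto-accepted and attorney-escalated lists."""
    auto: List[Classification] = []
    escalated: List[Classification] = []
    for call in calls:
        if needs_attorney(call, threshold):
            escalated.append(call)
        else:
            auto.append(call)
    return auto, escalated


if __name__ == "__main__":
    # Made-up example batch, for illustration only.
    batch = [
        Classification("doc-1", "privileged", 0.97),
        Classification("doc-2", "not_privileged", 0.62),
        Classification("doc-3", "privileged", 0.93),
        Classification("doc-4", "not_privileged", 0.99),
        Classification("doc-5", "not_privileged", 0.95),
    ]
    auto, escalated = route(batch)
    print("escalated:", [c.doc_id for c in escalated])
    print("escalation rate:", len(escalated) / len(batch))
```

The design choice that matters is the threshold: lowering it sends fewer documents to an attorney but lets more uncertain calls through unreviewed, which is the trade-off the researchers' calibrated thresholds are meant to manage.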
The stakes are real: mishandling privilege review isn't a product bug; it's potential malpractice. The paper's "Human-on-the-Loop" framing positions humans as a selective override rather than a constant presence, which is the only way AI-assisted review stays economically viable at the document volumes e-discovery typically involves.
This lands as law firms and legal tech vendors race to automate discovery workflows. The research doesn't argue against AI agents; it argues that deploying them without uncertainty-triggered escalation is the kind of shortcut that ends careers, and that the current defaults at many shops may already be dangerously close to that line.