When AI Says Why: A Root Cause Engine That Shows Its Work

JustDiag makes AI incident analysis auditable by tracking evidence, rival hypotheses, and unresolved conflicts rather than just outputting an answer.

An AI system for diagnosing production incidents now keeps a running paper trail, not just a verdict.

Researchers have introduced JustDiag, a diagnostic justification engine designed to make root cause analysis accountable in high-stakes environments. Instead of generating a fluent final answer and stopping there, the system maintains an explicit process state that tracks evidence, competing hypotheses, contradictions, and open questions throughout an investigation. The team tested it against 66 real-world incidents using a two-layer evaluation that scored both the quality of the final answer and the quality of the reasoning process. JustDiag outperformed a control system on both dimensions, though it resolved fewer cases outright. That is by design: it held uncertainty open when the evidence didn't support a clean conclusion.
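To make the idea of an explicit process state concrete, here is a minimal Python sketch. It is not JustDiag's actual implementation: the class names (`Hypothesis`, `ProcessState`), the `conclude` rule, and the sample incident are all hypothetical, chosen only to illustrate tracking evidence, rival hypotheses, contradictions, and open questions, and declining to answer while any of them stay unsettled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Hypothesis:
    name: str
    supporting: list[str] = field(default_factory=list)     # evidence ids
    contradicting: list[str] = field(default_factory=list)  # evidence ids


@dataclass
class ProcessState:
    """Hypothetical investigation state; not taken from the JustDiag source."""
    evidence: dict[str, str] = field(default_factory=dict)  # id -> observation
    hypotheses: dict[str, Hypothesis] = field(default_factory=dict)
    open_questions: list[str] = field(default_factory=list)

    def add_evidence(self, eid, text, supports=(), contradicts=()):
        # Record the observation and link it to every hypothesis it bears on.
        self.evidence[eid] = text
        for name in supports:
            self.hypotheses.setdefault(name, Hypothesis(name)).supporting.append(eid)
        for name in contradicts:
            self.hypotheses.setdefault(name, Hypothesis(name)).contradicting.append(eid)

    def conclude(self) -> Optional[str]:
        # Assumed rule: answer only when no questions are open and exactly one
        # hypothesis has support and no contradicting evidence. Otherwise
        # return None, leaving the case unresolved.
        if self.open_questions:
            return None
        viable = [h.name for h in self.hypotheses.values()
                  if h.supporting and not h.contradicting]
        return viable[0] if len(viable) == 1 else None


# Invented example incident, for illustration only.
state = ProcessState()
state.add_evidence("e1", "latency spike began after a deploy",
                   supports=["bad_deploy", "db_saturation"])
state.add_evidence("e2", "rollback did not restore latency",
                   contradicts=["bad_deploy"])
state.open_questions.append("Did the connection pool size change?")

first = state.conclude()      # an open question remains, so no verdict
state.open_questions.clear()  # suppose the question is answered
second = state.conclude()     # one uncontradicted hypothesis is left
```

In this sketch the first call declines to name a cause, and the second names `db_saturation` only after the open question is cleared and the rival hypothesis is contradicted. The audit trail is the state itself: each hypothesis carries the ids of the evidence for and against it.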

That last trade-off is the point. Most LLM-based diagnostics are optimized to produce a confident-sounding answer, which is exactly what incident response engineers should not trust blindly. JustDiag's willingness to leave a case unresolved is a feature, not a bug: it signals when the evidence is genuinely ambiguous rather than papering over gaps with prose.

The broader AI industry has spent years chasing benchmark scores on final answers while largely ignoring whether the reasoning behind them is traceable or auditable. JustDiag is a narrow but concrete step toward AI that can be cross-examined, not just quoted.

The Revision

Written by an AI system from the public sources credited above.