TreeTracer Surfaces Hidden LLM Bias in Probability Trees

Auditing an AI model by reading one of its outputs is like judging a coin's fairness by flipping it once.

Researchers have released TreeTracer, a visual analytics tool that takes a different approach to LLM bias auditing. Instead of inspecting a single generated response, it replaces key terms in a prompt using an ontology, runs hundreds of stochastic generations, and aggregates the results into a syntax-aligned hierarchical tree. An auxiliary language model merges nodes by classification, and the whole structure is rendered as a Sankey diagram. Two trees, one per semantic context, sit side by side so analysts can compare how a model's word choices shift depending on who or what is being described. A contrastive inference layer then computes counterfactual token probabilities directly, so a reviewer can see not just what the model said but how strongly it suppressed the alternatives.
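To make the aggregation idea concrete, here is a minimal Python sketch, not TreeTracer's actual code. It assumes whitespace tokenization and exact-prefix matching in place of the tool's syntax alignment and classifier-based node merging, and the sample continuations are invented for illustration rather than taken from any model. The function names (build_tree, branch_probs, compare) are hypothetical.

```python
from collections import defaultdict


def build_tree(generations):
    """Map each token prefix (as a tuple) to counts of the tokens that follow it."""
    tree = defaultdict(lambda: defaultdict(int))
    for tokens in generations:
        for i, tok in enumerate(tokens):
            tree[tuple(tokens[:i])][tok] += 1
    return tree


def branch_probs(tree, prefix):
    """Empirical next-token distribution after the given prefix."""
    counts = tree.get(tuple(prefix), {})
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()} if total else {}


def compare(tree_a, tree_b, prefix):
    """Per-token probability gap (context A minus context B) at a shared prefix."""
    pa, pb = branch_probs(tree_a, prefix), branch_probs(tree_b, prefix)
    return {tok: pa.get(tok, 0.0) - pb.get(tok, 0.0) for tok in set(pa) | set(pb)}


# Invented continuations for two prompt variants that differ in one swapped term.
samples_a = ["a doctor", "a doctor", "a nurse", "an engineer"]
samples_b = ["a nurse", "a nurse", "a doctor", "a teacher"]

tree_a = build_tree(s.split() for s in samples_a)
tree_b = build_tree(s.split() for s in samples_b)

# Positive values: the branch is more likely in context A; negative: in context B.
diff = compare(tree_a, tree_b, ["a"])
for tok, gap in sorted(diff.items(), key=lambda kv: -kv[1]):
    print(f"{tok:>8}: {gap:+.2f}")
```

Each tree node here corresponds to one band in a Sankey-style view, with band width proportional to the branch count. A real audit would use far more samples per context than this toy data.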

The practical stakes are higher than they look. Standard bias audits are built around static outputs or aggregate metrics that flatten probability distributions, so biases that live in lower-probability generation branches never surface. TreeTracer's case studies pit GPT-2 XL, an unaligned baseline, against the constitutionally aligned Apertus models, and the tool exposes specific harms: counterfactual pronoun suppression and the conversational sidelining of certain individuals. A preliminary user study found that the comparative interface reduced cognitive load for analysts.
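A small worked example shows why output-level checks can miss this. The sketch below uses two invented next-token distributions (not measurements from GPT-2 XL or Apertus) that share the same most likely token, so a top-1 comparison reports no difference, while a distribution-level measure such as total variation distance does. The helper names are hypothetical, and total variation is one possible measure chosen for illustration, not one the article attributes to TreeTracer.

```python
def top1(dist):
    """Most likely token in a distribution."""
    return max(dist, key=dist.get)


def total_variation(p, q):
    """Half the summed absolute probability gap over all tokens seen in p or q."""
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in set(p) | set(q))


# Invented pronoun distributions for two contexts; illustration only.
ctx_a = {"they": 0.6, "he": 0.3, "she": 0.1}
ctx_b = {"they": 0.6, "he": 0.1, "she": 0.3}

same_top1 = top1(ctx_a) == top1(ctx_b)   # the output-level check sees no change
tv = total_variation(ctx_a, ctx_b)       # the tail of the distribution has shifted

print(f"same top-1 token: {same_top1}, total variation: {tv:.2f}")
```

The two contexts would produce the same greedy output, yet a fifth of the probability mass has moved between the lower-ranked pronouns.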

Bias tooling is a crowded space, but most of it operates at the output layer; TreeTracer's bet is that the probability distribution underneath is where the real signal hides, and that making it visible is the only honest audit.
