LedgerAgent Gives AI Agents a Separate Memory for Policy Rules

A new inference-time method keeps task state in a dedicated ledger, cutting the rate at which customer-service agents act on stale facts or break domain rules.

A research team has built a way to stop AI customer-service agents from quietly forgetting the rules mid-conversation.

Current tool-calling agents stuff everything (user messages, tool responses, policy instructions) into a single prompt and rely on the model to reconstruct what it needs each time it acts. That works until it doesn't: the model recalls a fact from three turns ago but acts on a version that has since gone stale, or it makes a syntactically clean tool call that still violates a policy it technically "saw" earlier. LedgerAgent, described in a new paper, sidesteps this by maintaining a separate structured ledger of observed task state (facts, identifiers, constraints, conditions) and rendering that ledger into the prompt at each step. Before any environment-changing tool call fires, the ledger is also checked against state-dependent policy constraints, so violations are blocked before they happen. The system is inference-time only, meaning it wraps existing models rather than retraining them.
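To make the mechanism concrete, here is a minimal Python sketch of the two pieces described above: a ledger that is rendered into the prompt at each step, and a policy check that runs before an environment-changing tool call. Every name in it (`Ledger`, `guarded_call`, `no_double_refund`, the `issue_refund` tool, the `refunded_orders` key) is a hypothetical chosen for illustration; the article does not describe the paper's actual data model or interfaces.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Optional


@dataclass
class Ledger:
    """Hypothetical store of observed task state, kept outside the chat history."""
    facts: dict = field(default_factory=dict)

    def observe(self, key: str, value: Any) -> None:
        # Overwrite on update so only the latest observed value survives.
        self.facts[key] = value

    def render(self) -> str:
        # Text block that would be placed into the prompt at each step.
        lines = [f"- {k}: {v}" for k, v in sorted(self.facts.items())]
        return "TASK STATE\n" + "\n".join(lines)


# A policy returns a reason string when the call must be blocked, else None.
Policy = Callable[[dict, dict], Optional[str]]


def no_double_refund(facts: dict, args: dict) -> Optional[str]:
    # Illustrative rule only: refuse a refund the ledger says was already issued.
    if args["order_id"] in facts.get("refunded_orders", []):
        return f"order {args['order_id']} was already refunded"
    return None


# Assumed mapping from environment-changing tools to their policy checks.
POLICIES = {"issue_refund": [no_double_refund]}


def guarded_call(ledger: Ledger, tool_name: str, args: dict,
                 tool: Callable[..., Any]) -> dict:
    """Check the ledger against each policy before letting the tool run."""
    for policy in POLICIES.get(tool_name, []):
        reason = policy(ledger.facts, args)
        if reason is not None:
            return {"blocked": True, "reason": reason}
    return {"blocked": False, "result": tool(**args)}


ledger = Ledger()
ledger.observe("refunded_orders", ["A17"])
outcome = guarded_call(ledger, "issue_refund", {"order_id": "A17"},
                       lambda order_id: "refund issued")
print(outcome["blocked"], outcome.get("reason"))
```

The point of the design, as sketched here, is that the check reads from the ledger rather than asking the model to recall the constraint from earlier turns.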

The gap it closes matters because customer-service agents are exactly the setting where a forgotten constraint has real consequences — a refund issued twice, a cancellation that shouldn't have gone through. Testing across four customer-service domains with a mix of open- and closed-weight models showed improved average pass-k scores, with the biggest gains on stricter multi-trial consistency metrics — the tests that punish an agent for getting it right once but wrong the next.
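The article does not define its pass-k metric. As an illustration of how a multi-trial consistency score can punish intermittent success, the sketch below assumes one common formulation: the probability that k trials drawn without replacement from n recorded trials of a task all succeed. The function name and the choice of estimator are assumptions, not details confirmed by the paper.

```python
from math import comb


def all_k_pass_rate(successes: int, trials: int, k: int) -> float:
    """Chance that k trials sampled from `trials` runs are all successes.

    Assumed estimator: C(successes, k) / C(trials, k).
    """
    if not 0 < k <= trials:
        raise ValueError("k must be between 1 and the number of trials")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    # math.comb returns 0 when k > successes, so the score drops to 0.
    return comb(successes, k) / comb(trials, k)


# An agent that solves a task in 3 of 4 runs looks fine at k=1
# and progressively worse as k grows.
for k in (1, 2, 4):
    print(k, all_k_pass_rate(3, 4, k))
```

Under this assumption, a single failed run out of four leaves the k=1 score at 0.75 but drives the k=4 score to zero, which is the kind of penalty the stricter metrics impose.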

Explicit state registers are a well-worn idea in software engineering; it's telling that LLM agent research is only now rediscovering them as a reliability fix rather than treating prompt length as the only knob worth turning.

The Revision

Written by an AI system from the public sources credited above.