Why AI Agents Cannot Get By on Short-Term Memory

A formal result explains why truly general AI agents cannot get by on what they observe in the moment alone.

A paper posted to arXiv argues, through formal proof, that any agent operating across multiple environments will hit a fundamental wall if it relies solely on current observations. The core finding: when two domains produce similar observations at some point but require different optimal actions there, a capable agent must maintain distinct memory states for each domain at that point. The paper calls this a "separation theorem": uniform near-optimality across environments is impossible without domain-relevant memory. It goes further: if an agent stores enough information to estimate goal-related values, that same memory is sufficient to approximately reconstruct the local transition dynamics of its environment.
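The separation argument can be illustrated with a toy case. The sketch below is not from the paper: the two domains, the "cue" and "fork" observations, the rewards, and every function name are assumptions made up for illustration. It enumerates all deterministic policies that act on the current observation only and compares their best worst-case reward with that of an agent that remembers which domain it is in.

```python
from itertools import product

# Hypothetical toy setup, not taken from the paper: two domains emit the same
# "fork" observation but reward opposite actions there.
OPTIMAL_AT_FORK = {"A": "left", "B": "right"}
ACTIONS = ("left", "right")
OBSERVATIONS = ("cue_A", "cue_B", "fork")


def episode(domain):
    """Observations seen in a domain: a domain-specific cue, then the shared fork."""
    return [f"cue_{domain}", "fork"]


def reward(domain, fork_action):
    """Reward is 1 only when the action taken at the fork suits the domain."""
    return 1.0 if fork_action == OPTIMAL_AT_FORK[domain] else 0.0


def run_memoryless(policy, domain):
    """Act from the current observation alone; policy maps observation -> action."""
    actions = {obs: policy[obs] for obs in episode(domain)}
    return reward(domain, actions["fork"])


def run_with_memory(domain):
    """Store the cue, then use it to disambiguate the fork."""
    memory = None
    fork_action = None
    for obs in episode(domain):
        if obs.startswith("cue_"):
            memory = obs[len("cue_"):]  # remember which domain the cue named
        else:
            # Stand-in for a learned mapping from (memory, observation) to action.
            fork_action = OPTIMAL_AT_FORK[memory]
    return reward(domain, fork_action)


def best_memoryless_worst_case():
    """Best worst-case reward over every deterministic memoryless policy."""
    best = 0.0
    for choice in product(ACTIONS, repeat=len(OBSERVATIONS)):
        policy = dict(zip(OBSERVATIONS, choice))
        worst = min(run_memoryless(policy, d) for d in OPTIMAL_AT_FORK)
        best = max(best, worst)
    return best


def memory_worst_case():
    """Worst-case reward across domains for the agent that remembers the cue."""
    return min(run_with_memory(d) for d in OPTIMAL_AT_FORK)
```

Under these assumptions, every deterministic memoryless policy earns zero in at least one domain, while the agent that remembers the cue earns full reward in both. Randomizing at the fork does not rescue the memoryless agent: it only splits the failure between the two domains, so its expected reward in the worse domain is at most 0.5.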

This matters because it reframes memory not as a nice-to-have engineering feature but as a theoretical necessity. The implications cut across the current wave of long-context models and agent frameworks: raw context windows are not the same as structured, domain-tagged memory, and conflating the two may explain why multi-task agents still stumble when environments superficially resemble each other.

The AI field has spent years debating how much context a model needs; this paper suggests the more pressing question is what kind of information gets preserved and indexed, not just how many tokens fit in a window.
