Mamba prediction model fails to reveal causal links in benchmark

A Mamba state‑space model trained for next‑step prediction does not reliably recover causal structure.

The authors built a reusable falsification benchmark that includes synthetic generators (VAR, Lorenz, CauseMe‑style), three intervention types (hard do, soft noise, random forcing), and edge‑provenance cards on three real datasets. Across five evaluation stages, the claimed advantage of the Mamba bottleneck vanished. A plain linear bottleneck matched or beat it, tuned Lasso outperformed it on synthetic CauseMe benchmarks, and on the Lorenz‑96 real benchmark classical PCMCI and Granger methods clustered together ahead of the bottleneck. The apparent benefit from interventions shrank to a sample‑size artifact and disappeared under standard do‑interventions. It persisted only under the non‑standard random‑forcing scheme, and the same pattern showed up in vanilla bivariate Granger tests, so it is not specific to the Mamba model.
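To make the kind of baseline mentioned above concrete, here is a minimal sketch of a lag‑1 bivariate Granger test on a two‑variable VAR(1) generator. It is not the authors' benchmark code: the function names, the coefficients (0.6 and 0.5), the seed, and the series length are all illustrative assumptions, and only the Python standard library is used.

```python
import random


def simulate_var(n_steps, coupling=0.5, seed=0):
    """Simulate a 2-variable VAR(1) in which x drives y but y never drives x.

    The coefficients are illustrative assumptions, not values from the study.
    """
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n_steps - 1):
        x_next = 0.6 * x[-1] + rng.gauss(0, 1)
        y_next = 0.6 * y[-1] + coupling * x[-1] + rng.gauss(0, 1)
        x.append(x_next)
        y.append(y_next)
    return x, y


def ols_rss(features, target):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    k = len(features[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(row[i] * row[j] for row in features) for j in range(k)]
        + [sum(row[i] * t for row, t in zip(features, target))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum(
        (t - sum(b * f for b, f in zip(beta, row))) ** 2
        for row, t in zip(features, target)
    )


def granger_f(cause, effect):
    """Lag-1 Granger F statistic: does the past of `cause` improve the
    prediction of `effect` beyond the past of `effect` alone?"""
    target = effect[1:]
    restricted = [[1.0, e] for e in effect[:-1]]
    full = [[1.0, e, c] for e, c in zip(effect[:-1], cause[:-1])]
    rss_restricted = ols_rss(restricted, target)
    rss_full = ols_rss(full, target)
    dof = len(target) - 3  # three fitted parameters in the full model
    return (rss_restricted - rss_full) / (rss_full / dof)


x, y = simulate_var(2000)
f_xy = granger_f(x, y)  # true edge x -> y
f_yx = granger_f(y, x)  # no edge y -> x
print(f"F(x -> y) = {f_xy:.1f}, F(y -> x) = {f_yx:.1f}")
```

On data like this, the statistic for the true direction should be far larger than the one for the reverse direction. A simple control of this kind is the sort of baseline a learned bottleneck would need to beat before any causal advantage is credited to the architecture.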

Why it matters: The result undercuts the headline‑grabbing claim that next‑step prediction alone uncovers hidden causal structure. It reminds researchers that sophisticated architectures do not automatically confer causal insight and that rigorous, control‑rich benchmarks are essential. Practitioners seeking causal discovery should favour well‑understood linear and classical tools unless a newer method demonstrates clear, method‑specific gains.

In short, the lasting contribution is the benchmark itself, not the Mamba bottleneck. The study underscores that, without careful falsification, apparent causal breakthroughs may simply be artifacts of sample size or intervention design.
