Veriphi Challenges the Case for Certified AI Training

The assumption that certified training always beats adversarial training in neural network verification turns out to be wrong — at least according to new research behind a tool called Veriphi.

Veriphi is a GPU-accelerated system that stress-tests neural networks by combining fast adversarial attacks with formal mathematical guarantees from a verification method called α,β-CROWN. The researchers ran it on MNIST and CIFAR-10, two standard image datasets, across three training approaches: standard, adversarial, and certified. On MNIST, a relatively simple dataset with 784 input dimensions, a certified training technique called Interval Bound Propagation (IBP) reached 78% certified accuracy. On the more complex CIFAR-10, the same technique became nearly useless. Adversarial training with projected gradient descent (PGD) took over there, reaching 94% certification at small perturbation sizes. The team also scaled the system to a 105.8-million-parameter model used in aerospace logistics, claiming a 5x speedup in verification time through what they call attack-guided falsification.
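To make the attack-then-verify idea concrete, here is a minimal Python sketch. It is not Veriphi's code, and the article does not describe how the tool implements attack-guided falsification. The sketch assumes the term means trying a cheap attack first and running the expensive verifier only when no counterexample turns up. Plain interval bound propagation stands in for α,β-CROWN, random corner sampling stands in for a PGD attack, and every function name is hypothetical.

```python
import numpy as np

def forward(weights, biases, x):
    """Run a small fully connected ReLU network (no ReLU on the last layer)."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        x = W @ x + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)
    return x

def ibp_bounds(weights, biases, lo, hi):
    """Interval bound propagation: push the box [lo, hi] through each layer."""
    for i, (W, b) in enumerate(zip(weights, biases)):
        center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
        c = W @ center + b          # image of the box center
        r = np.abs(W) @ radius      # worst-case spread of the box
        lo, hi = c - r, c + r
        if i < len(weights) - 1:    # ReLU is monotone, so apply it to both ends
            lo, hi = np.maximum(lo, 0.0), np.maximum(hi, 0.0)
    return lo, hi

def cheap_attack(weights, biases, x, label, eps, trials=256, seed=0):
    """Hypothetical stand-in for PGD: sample corners of the L-infinity ball."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x_adv = x + eps * rng.choice([-1.0, 1.0], size=x.shape)
        if np.argmax(forward(weights, biases, x_adv)) != label:
            return x_adv            # concrete counterexample found
    return None

def check_robustness(weights, biases, x, label, eps):
    """Attack first; fall back to the (stand-in) verifier only if the attack fails."""
    if cheap_attack(weights, biases, x, label, eps) is not None:
        return "falsified"          # no need to run the verifier at all
    lo, hi = ibp_bounds(weights, biases, x - eps, x + eps)
    # Sound but loose test: the true logit's lower bound beats every other upper bound.
    if lo[label] > np.delete(hi, label).max():
        return "certified"
    return "unknown"                # bounds too loose to decide either way

# Toy example: a single identity layer over two inputs.
weights, biases = [np.eye(2)], [np.zeros(2)]
x, label = np.array([1.0, 0.0]), 0
small_eps_result = check_robustness(weights, biases, x, label, eps=0.1)
large_eps_result = check_robustness(weights, biases, x, label, eps=0.6)
```

In this toy case the small perturbation budget is certified and the large one is falsified by the attack alone, so the bound computation is skipped. That skipped work is the intuition behind a falsification-first speedup, under the assumption stated above. The "unknown" outcome also hints at why interval bounds can degrade on harder inputs: the box widens at every layer, so a network can be robust while its bounds are too loose to prove it.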

The finding matters because the AI safety and robustness field has long treated certified training as the more rigorous, production-ready option. If the better method is actually dataset-dependent, organizations deploying models in high-stakes environments — aviation, medical imaging, autonomous vehicles — cannot just pick a single hardening strategy and call it done. That raises the cost and complexity of verification at exactly the moment regulators are starting to demand it.

Veriphi is not the first verification framework to claim speedups, but it is one of the few tested at production parameter counts rather than toy models — a gap that has historically made benchmark results hard to trust in real deployments.
