Accuracy Alone Misses When AI Breaks Its Own Rules

A new metric called the Rule Violation Score measures whether ML models actually follow logical constraints — not just whether they get answers right.

Researchers say predictive accuracy is only half the story when evaluating machine learning models.

A paper posted to arXiv introduces the Rule Violation Score (RVS), a metric that measures how often a model violates logical or domain-specific rules, separately from whether its predictions are numerically accurate. RVS handles both hard rules (strict constraints that must never be broken) and soft rules (statistical regularities that should mostly hold). It applies to any predictive model whose outputs are expressed over a relational vocabulary, and its evaluation queries can be generated automatically from Horn rules using SQL. The researchers tested it on three benchmarks spanning knowledge graph link prediction and relational regression, covering rule-based, embedding-based, and neuro-symbolic models.
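To make the idea concrete, here is a minimal sketch of how a Horn rule might be compiled into a SQL query that counts violations. It is not the paper's implementation: the `facts(pred, subj, obj)` table, the `horn_rule_to_sql` and `violation_rate` helpers, the example rule, and the plain violations-over-groundings ratio are all assumptions made for illustration, and the paper's actual RVS definition may differ.

```python
import sqlite3


def horn_rule_to_sql(body, head):
    """Compile a Horn rule over binary predicates into a counting query.

    body: list of (predicate, subject_var, object_var) atoms
    head: one (predicate, subject_var, object_var) atom whose variables
          must all appear in the body
    Assumes a hypothetical table facts(pred, subj, obj) of predicted facts.
    """
    bindings = {}  # variable name -> first column that binds it
    where, body_params = [], []
    for i, (pred, s, o) in enumerate(body):
        where.append(f"b{i}.pred = ?")
        body_params.append(pred)
        for var, col in ((s, f"b{i}.subj"), (o, f"b{i}.obj")):
            if var in bindings:
                # Repeated variable: enforce equality with its first binding.
                where.append(f"{col} = {bindings[var]}")
            else:
                bindings[var] = col
    tables = ", ".join(f"facts AS b{i}" for i in range(len(body)))
    head_pred, hs, ho = head
    sql = (
        "SELECT COUNT(*), COALESCE(SUM(violated), 0) FROM ("
        "SELECT NOT EXISTS (SELECT 1 FROM facts AS h WHERE h.pred = ? "
        f"AND h.subj = {bindings[hs]} AND h.obj = {bindings[ho]}) AS violated "
        f"FROM {tables} WHERE {' AND '.join(where)})"
    )
    # The head placeholder appears first in the SQL text, so it binds first.
    return sql, [head_pred] + body_params


def violation_rate(conn, body, head):
    """Return (groundings, violations, rate) for one rule."""
    sql, params = horn_rule_to_sql(body, head)
    groundings, violations = conn.execute(sql, params).fetchone()
    rate = violations / groundings if groundings else 0.0
    return groundings, violations, rate


# Toy predicted facts (illustrative data, not from the paper).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facts (pred TEXT, subj TEXT, obj TEXT)")
conn.executemany(
    "INSERT INTO facts VALUES (?, ?, ?)",
    [
        ("born_in", "alice", "paris"),
        ("born_in", "bob", "lyon"),
        ("born_in", "carol", "rome"),
        ("located_in", "paris", "france"),
        ("located_in", "lyon", "france"),
        ("located_in", "rome", "italy"),
        ("citizen_of", "alice", "france"),
        ("citizen_of", "bob", "spain"),
        ("citizen_of", "carol", "italy"),
    ],
)

# Hypothetical soft rule: born_in(x, c) AND located_in(c, k) -> citizen_of(x, k)
body = [("born_in", "x", "c"), ("located_in", "c", "k")]
head = ("citizen_of", "x", "k")
groundings, violations, rate = violation_rate(conn, body, head)
print(groundings, violations, rate)
```

In this toy data the rule body holds for three people and the predicted head is missing for one of them. Whether such a miss is tolerated (soft rule) or treated as a failure (hard rule) is a policy choice layered on top of the count.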

The core finding is pointed: two models with identical predictive accuracy can differ sharply in how well they comply with logical rules. In domains like healthcare, finance, or autonomous systems, a model that gets the numbers right but ignores known constraints (say, by recommending a drug combination that medical rules prohibit) is not safe to deploy. Standard accuracy metrics never surface that gap.
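A toy illustration of that gap, under stated assumptions: the cases, the drug names, the two sets of predictions, and the hard rule (patients on an anticoagulant must never be given `drug_x`) are all invented for this sketch and do not come from the paper.

```python
# Invented evaluation cases: a context flag plus the correct recommendation.
cases = [
    {"on_anticoagulant": True, "truth": "drug_y"},
    {"on_anticoagulant": True, "truth": "drug_y"},
    {"on_anticoagulant": False, "truth": "drug_x"},
    {"on_anticoagulant": False, "truth": "drug_x"},
    {"on_anticoagulant": True, "truth": "drug_z"},
    {"on_anticoagulant": False, "truth": "drug_y"},
]

# Two hypothetical models, each wrong on exactly two cases.
model_a = ["drug_y", "drug_z", "drug_x", "drug_x", "drug_z", "drug_z"]
model_b = ["drug_y", "drug_x", "drug_x", "drug_x", "drug_x", "drug_y"]


def accuracy(preds):
    """Fraction of cases where the prediction matches the truth."""
    return sum(p == c["truth"] for p, c in zip(preds, cases)) / len(cases)


def hard_rule_violation_rate(preds):
    """Share of rule-applicable cases where the forbidden drug is predicted."""
    applicable = [p for p, c in zip(preds, cases) if c["on_anticoagulant"]]
    if not applicable:
        return 0.0
    return sum(p == "drug_x" for p in applicable) / len(applicable)


for name, preds in (("model_a", model_a), ("model_b", model_b)):
    print(name, accuracy(preds), hard_rule_violation_rate(preds))
```

Both models get four of six cases right, yet only the second one breaks the rule, on two of the three patients it applies to. An accuracy leaderboard would rank them as equals.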

This matters because the ML field has spent years refining accuracy benchmarks while rarely treating structured rule adherence as a first-class evaluation dimension. RVS does not replace existing metrics; it sits alongside them. The paper also notes that RVS can flag inconsistencies in the training data itself, a useful side effect.

Whether the broader research community adopts RVS as a standard depends on tooling and benchmark uptake — a technically sound metric that nobody plugs into their eval pipeline changes nothing.

The Revision

Written by an AI system from the public sources credited above.