
Humanoid Robots Need Data Standards, Not Just Better Hardware

A new ISO working draft argues that scattered, incompatible robot training data is slowing down humanoid AI more than chip or motor limitations.

The real bottleneck for humanoid robots may be paperwork — specifically, the lack of shared data standards that let physical experience accumulate across machines and organizations.

Researchers contributing to ISO/WD 26264-1, a draft standard under ISO/TC 299/WG 16, published a paper arguing that humanoid robot datasets are fundamentally different from ordinary AI training data. A useful dataset, they contend, must preserve the full relationship among robot body, action, task, scene, execution trace, and outcome — not just isolated sensor readings. Reusability also depends on what they call physical coherence: multimodal data streams are only shareable if timing, coordinate frames, calibration, and synchronization assumptions are documented and inspectable. The core problem, they argue, is not that there is too little data but that existing data is non-cumulative — siloed inside individual labs, collected at high cost, and evaluated against inconsistent benchmarks.
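To make the idea concrete, here is a minimal sketch of what a record that keeps those relationships together might look like, along with a check for the undocumented assumptions that would block reuse. Every class, field, and function name below is hypothetical and chosen for illustration; none of it is taken from ISO/WD 26264-1 or the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamInfo:
    """Metadata needed to interpret one sensor stream (hypothetical fields)."""
    name: str
    frame_id: str                              # coordinate frame of the samples
    clock_source: str                          # clock that stamped the samples
    calibration_ref: str                       # pointer to a calibration record
    max_sync_error_ms: Optional[float] = None  # documented sync bound, if any


@dataclass
class Episode:
    """One recorded attempt, keeping body, task, scene, actions,
    execution trace, and outcome in a single record."""
    robot_body: str
    task: str
    scene: str
    actions: List[str]
    trace: List[str]
    outcome: str
    streams: List[StreamInfo] = field(default_factory=list)


def coherence_gaps(ep: Episode) -> List[str]:
    """List what is missing or undocumented; an empty list means the
    episode passes this (deliberately simple) reusability check."""
    gaps = []
    # The relational part: every element of the episode must be present.
    for name in ("robot_body", "task", "scene", "actions", "trace", "outcome"):
        if not getattr(ep, name):
            gaps.append(f"missing {name}")
    # The physical-coherence part: each stream documents its assumptions.
    for s in ep.streams:
        if not s.frame_id:
            gaps.append(f"{s.name}: coordinate frame undocumented")
        if not s.clock_source:
            gaps.append(f"{s.name}: clock source undocumented")
        if not s.calibration_ref:
            gaps.append(f"{s.name}: calibration undocumented")
        if s.max_sync_error_ms is None:
            gaps.append(f"{s.name}: sync bound undocumented")
    return gaps
```

Under this sketch, a lab receiving an episode from another organization could call `coherence_gaps` before training on it and reject or flag anything that returns a non-empty list. A real standard would presumably specify far more than presence checks, such as units, frame conventions, and versioned calibration formats.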

This matters because the humanoid robot field is implicitly betting on a scaling playbook borrowed from language models: gather enough diverse experience and general capability emerges. That bet fails if data collected by one organization cannot be meaningfully reused by another. Without horizontal infrastructure covering metadata, provenance, versioning, and lifecycle management, every new robot program starts from scratch.

The parallel to early web standards is hard to miss — HTTP and HTML were unglamorous plumbing, but without them no amount of clever browser engineering would have mattered. Whether an ISO working draft can move fast enough to matter as Figure, Agility, and Boston Dynamics race to ship commercial units is a different question entirely.


The Revision

Written by an AI system from the public sources credited above.