A research paper posted to arXiv describes a locomotion system that lets legged robots switch movement strategies on the fly, with no terrain label required.
The system, called CTS-MoE, trains a quadruped robot using a dense mixture-of-experts neural network. Multiple specialized sub-networks handle different terrain types, and a perception-based gating mechanism blends their outputs during deployment. A multi-critic design with task-specific value heads prevents the conflicting reward signals that typically plague multi-task reinforcement learning. The whole thing trains end-to-end in one stage, rather than through the sequential teacher-then-student pipelines that dominate the field. Tests ran on a Unitree Go1 robot, both in simulation and on physical hardware, across terrain the model had and had not seen during training.
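To make the architecture concrete, here is a minimal NumPy sketch of dense mixture-of-experts routing with per-task value heads. It is an illustration under stated assumptions, not the authors' implementation: the class names, the layer sizes, the single-linear-layer experts, and the advantage-combining rule are all hypothetical stand-ins chosen to keep the example short.

```python
import numpy as np

rng = np.random.default_rng(0)

# All sizes below are illustrative assumptions, not values from the paper.
OBS_DIM = 48      # hypothetical size of the perception + proprioception vector
NUM_EXPERTS = 4   # hypothetical number of terrain specialists
ACT_DIM = 12      # one target per joint on a 12-joint quadruped
NUM_TASKS = 3     # hypothetical number of reward groups, one value head each


def softmax(x):
    """Numerically stable softmax over the last axis."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class DenseMoEPolicy:
    """Dense MoE: every expert runs on every step; the gate only reweights them."""

    def __init__(self):
        # Each expert is a single linear layer here purely for brevity.
        self.w_gate = rng.normal(0.0, 0.1, (OBS_DIM, NUM_EXPERTS))
        self.w_experts = rng.normal(0.0, 0.1, (NUM_EXPERTS, OBS_DIM, ACT_DIM))

    def forward(self, obs):
        # Gate weights come from the observation alone: no terrain label is fed in.
        gate = softmax(obs @ self.w_gate)                                # (batch, experts)
        expert_actions = np.einsum("bo,eoa->bea", obs, self.w_experts)   # (batch, experts, act)
        action = np.einsum("be,bea->ba", gate, expert_actions)           # (batch, act)
        return action, gate


class MultiCritic:
    """One value estimate per reward group, so returns are not fitted as a single sum."""

    def __init__(self):
        self.w_heads = rng.normal(0.0, 0.1, (OBS_DIM, NUM_TASKS))

    def values(self, obs):
        return obs @ self.w_heads                                        # (batch, tasks)


def combined_advantage(returns, values, eps=1e-8):
    """Assumed combining rule: normalize each task's advantage, then sum across tasks."""
    adv = returns - values
    adv = (adv - adv.mean(axis=0)) / (adv.std(axis=0) + eps)
    return adv.sum(axis=1)                                               # (batch,)
```

Because the mixture is dense, the gate produces a soft blend rather than a hard switch: every expert contributes on every step, and the softmax weights shift gradually as the perceived terrain changes.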
The gap this closes matters: prior approaches used either one monolithic policy that plays it safe on every surface or hierarchical systems that struggle when terrain types blur together. The authors report that CTS-MoE achieves lower tracking error and higher success rates than monolithic baselines, suggesting the specialization is real and not just benchmark dressing. Because terrain classification happens implicitly, through perception alone, the system removes a brittle dependency that trips up robots in the real world.
Legged robotics has seen a wave of sim-to-hardware transfer work in recent years; what's notable here is the single-stage training approach, which the authors frame as both simpler and more general. Whether that holds outside a controlled hardware test is the next question to answer.