Learning Humanoid Locomotion with Transformers

Abstract

We present a sim-to-real learning-based approach for real-world humanoid locomotion. Our controller is a causal Transformer trained by autoregressive prediction of future actions from the history of observations and actions. We hypothesize that the observation-action history contains useful information about the world that a powerful Transformer model can use to adapt its behavior in-context, without updating its weights. We do not use state estimation, dynamics models, trajectory optimization, reference trajectories, or pre-computed gait libraries. Our controller is trained with large-scale model-free reinforcement learning on an ensemble of randomized environments in simulation and deployed to the real world in a zero-shot fashion. We evaluate our approach in high-fidelity simulation and successfully deploy it to the real robot as well. To the best of our knowledge, this is the first demonstration of a fully learning-based method for real-world full-sized humanoid locomotion.
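
To make the idea of a causal controller over the observation-action history concrete, the following is a minimal sketch, not the architecture used in the paper. It assumes a single attention head, one layer, random untrained weights, and a token formed by concatenating each observation with the previous action. The class name `CausalPolicySketch` and all dimensions are hypothetical, and the reinforcement learning training loop is not shown. The sketch only illustrates two properties named in the abstract: the action at step t depends on steps 0..t and never on later steps, and the output changes with the history while the weights stay fixed.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class CausalPolicySketch:
    """Hypothetical single-head causal attention over (obs, prev action) tokens."""

    def __init__(self, obs_dim, act_dim, hidden=8, seed=0):
        rng = random.Random(seed)
        tok_dim = obs_dim + act_dim
        # Random, untrained weights: an assumption for illustration only.
        self.Wq = random_matrix(hidden, tok_dim, rng)
        self.Wk = random_matrix(hidden, tok_dim, rng)
        self.Wv = random_matrix(hidden, tok_dim, rng)
        self.Wout = random_matrix(act_dim, hidden, rng)
        self.scale = 1.0 / math.sqrt(hidden)

    def forward(self, observations, prev_actions):
        """Return one predicted action per time step in the history."""
        tokens = [o + a for o, a in zip(observations, prev_actions)]
        q = [matvec(self.Wq, t) for t in tokens]
        k = [matvec(self.Wk, t) for t in tokens]
        v = [matvec(self.Wv, t) for t in tokens]
        actions = []
        for t in range(len(tokens)):
            # Causal mask: step t attends only to steps 0..t.
            scores = [
                self.scale * sum(a * b for a, b in zip(q[t], k[s]))
                for s in range(t + 1)
            ]
            weights = softmax(scores)
            context = [
                sum(weights[s] * v[s][j] for s in range(t + 1))
                for j in range(len(v[0]))
            ]
            actions.append(matvec(self.Wout, context))
        return actions


# Example history of three steps (made-up numbers).
policy = CausalPolicySketch(obs_dim=3, act_dim=2)
obs = [[0.1, 0.2, 0.3], [0.0, -0.1, 0.4], [0.5, 0.5, -0.2]]
prev = [[0.0, 0.0], [0.1, -0.1], [0.2, 0.0]]
acts = policy.forward(obs, prev)
```

At deployment, a controller of this kind would take only the last element of `acts` as the command for the current step, then append the new observation and the issued action to the history before the next call.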
