Abstract
Teleoperating humanoid robots in a whole-body manner marks a fundamental step toward developing general-purpose robotic intelligence, with human motion providing an ideal interface for controlling all degrees of freedom. Yet, most current humanoid teleoperation systems fall short of enabling coordinated whole-body behavior, typically limiting themselves to isolated locomotion or manipulation tasks. We present the Teleoperated Whole-Body Imitation System (TWIST), a system for humanoid teleoperation through whole-body motion imitation. We first generate reference motion clips by retargeting human motion capture data to the humanoid robot. We then develop a robust, adaptive, and responsive whole-body controller using a combination of reinforcement learning and behavior cloning (RL+BC). Through systematic analysis, we demonstrate how incorporating privileged future motion frames and real-world motion capture (MoCap) data improves tracking accuracy. TWIST enables real-world humanoid robots to achieve unprecedented, versatile, and coordinated whole-body motor skills--spanning whole-body manipulation, legged manipulation, locomotion, and expressive movement--using a single unified neural network controller. Our project website: https://humanoid-teleop.github.io