Humanoid Policy ~ Human Policy

Abstract

Training manipulation policies for humanoid robots with diverse data enhances their robustness and generalization across tasks and platforms. However, learning solely from robot demonstrations is labor-intensive, requiring expensive tele-operated data collection which is difficult to scale. This paper investigates a more scalable data source, egocentric human demonstrations, to serve as cross-embodiment training data for robot learning. We mitigate the embodiment gap between humanoids and humans from both the data and modeling perspectives. We collect an egocentric task-oriented dataset (PH2D) that is directly aligned with humanoid manipulation demonstrations. We then train a human-humanoid behavior policy, which we term Human Action Transformer (HAT). The state-action space of HAT is unified for both humans and humanoid robots and can be differentiably retargeted to robot actions. Co-trained with smaller-scale robot data, HAT directly models humanoid robots and humans as different embodiments without additional supervision. We show that human data improves both generalization and robustness of HAT with significantly better data collection efficiency. Code and data: https://human-as-robot.github.io/
