Abstract
Enabling humanoid robots to exploit physical contact, rather than simply avoid collisions, is crucial for autonomy in unstructured environments. Traditional optimization-based planners struggle with contact complexity, while on-policy reinforcement learning (RL) is sample-inefficient and has limited multi-task ability. We propose a framework combining a learned world model with sampling-based Model Predictive Control (MPC), trained on a demonstration-free offline dataset to predict future outcomes in a compressed latent space. To address sparse contact rewards and sensor noise, the MPC uses a learned surrogate value function for dense, robust planning. Our single, scalable model supports contact-aware tasks, including wall support after perturbation, blocking incoming objects, and traversing height-limited arches, with improved data efficiency and multi-task capability over on-policy RL. Deployed on a physical humanoid, our system achieves robust, real-time contact planning from proprioception and ego-centric depth images. Website: https://ego-vcp.github.io/