Abstract
Learning policies in simulation and transferring them to the real world has become a promising approach in dexterous manipulation. However, bridging the sim-to-real gap for each new task requires substantial human effort, such as careful reward engineering, hyperparameter tuning, and system identification. In this work, we present a system that leverages low-level skills to address these challenges for more complex tasks. Specifically, we introduce a hierarchical policy for in-hand object reorientation based on previously acquired rotation skills. This hierarchical policy learns to select which low-level skill to execute based on feedback from both the environment and the low-level skill policies themselves. Compared to learning from scratch, the hierarchical policy is more robust to out-of-distribution changes and transfers easily from simulation to real-world environments. Additionally, we propose a generalizable object pose estimator that uses proprioceptive information, low-level skill predictions, and control errors as inputs to estimate the object pose over time. We demonstrate that our system can reorient objects, including symmetrical and textureless ones, to a desired pose.