Abstract
Reinforcement learning (RL) holds great promise for enabling autonomous acquisition of complex robotic manipulation skills, but realizing this potential in real-world settings has been challenging. We present a human-in-the-loop vision-based RL system that demonstrates impressive performance on a diverse set of dexterous manipulation tasks, including dynamic manipulation, precision assembly, and dual-arm coordination. Our approach integrates demonstrations and human corrections, efficient RL algorithms, and other system-level design choices to learn policies that achieve near-perfect success rates and fast cycle times within just 1 to 2.5 hours of training. We show that our method significantly outperforms imitation learning baselines and prior RL approaches, with an average 2x improvement in success rate and 1.8x faster execution. Through extensive experiments and analysis, we provide insights into the effectiveness of our approach, demonstrating how it learns robust, adaptive policies for both reactive and predictive control strategies. Our results suggest that RL can indeed learn a wide range of complex vision-based manipulation policies directly in the real world within practical training times. We hope this work will inspire a new generation of learned robotic manipulation techniques, benefiting both industrial applications and research advancements. Videos and code are available at our project website: https://hil-serl.github.io/.