FastTD3: Simple, Fast, and Capable Reinforcement Learning for Humanoid Control

Abstract

Reinforcement learning (RL) has driven significant progress in robotics, but its complexity and long training times remain major bottlenecks. In this report, we introduce FastTD3, a simple, fast, and capable RL algorithm that significantly speeds up training for humanoid robots in popular suites such as HumanoidBench, IsaacLab, and MuJoCo Playground. Our recipe is remarkably simple: we train an off-policy TD3 agent with several modifications -- parallel simulation, large-batch updates, a distributional critic, and carefully tuned hyperparameters. FastTD3 solves a range of HumanoidBench tasks in under 3 hours on a single A100 GPU, while remaining stable during training. We also provide a lightweight and easy-to-use implementation of FastTD3 to accelerate RL research in robotics.
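
For readers unfamiliar with the TD3 base algorithm named above, the following is a minimal sketch of the two ingredients of the standard TD3 update target: target policy smoothing and the clipped double-Q target. It is not the FastTD3 implementation. It uses scalar critics rather than the distributional critic mentioned in the abstract, and the function names, default hyperparameters, and list-based batch layout are assumptions chosen only for illustration.

```python
import random


def clip(x, lo, hi):
    """Clamp a scalar to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))


def smoothed_target_action(action, noise_std=0.2, noise_clip=0.5,
                           act_limit=1.0, rng=random):
    """Target policy smoothing from standard TD3 (illustrative defaults).

    Adds clipped Gaussian noise to each action dimension, then clips the
    result to the action bounds [-act_limit, act_limit].
    """
    smoothed = []
    for a in action:
        noise = clip(rng.gauss(0.0, noise_std), -noise_clip, noise_clip)
        smoothed.append(clip(a + noise, -act_limit, act_limit))
    return smoothed


def td3_targets(rewards, dones, next_q1, next_q2, gamma=0.99):
    """Clipped double-Q targets for a batch of transitions.

    y = r + gamma * (1 - done) * min(Q1', Q2'), where Q1' and Q2' are the
    two target critics evaluated at the smoothed target action.
    """
    return [
        r + gamma * (1.0 - d) * min(q1, q2)
        for r, d, q1, q2 in zip(rewards, dones, next_q1, next_q2)
    ]


# Hypothetical two-transition batch; the second transition is terminal.
targets = td3_targets(
    rewards=[1.0, 0.5],
    dones=[0.0, 1.0],
    next_q1=[2.0, 3.0],
    next_q2=[1.5, 4.0],
    gamma=0.9,
)
noisy_action = smoothed_target_action([0.9, -0.95, 0.0], rng=random.Random(0))
```

In a parallel-simulation, large-batch setting such as the one the abstract describes, the same arithmetic would be applied to tensors holding many transitions at once rather than to Python lists.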
