Sample Factory: Egocentric 3D Control from Pixels at 100000 FPS with Asynchronous Reinforcement Learning

Abstract

Increasing the scale of reinforcement learning experiments has allowed researchers to achieve unprecedented results in both training sophisticated agents for video games, and in sim-to-real transfer for robotics. Typically such experiments rely on large distributed systems and require expensive hardware setups, limiting wider access to this exciting area of research. In this work we aim to solve this problem by optimizing the efficiency and resource utilization of reinforcement learning algorithms instead of relying on distributed computation. We present the "Sample Factory", a high-throughput training system optimized for a single-machine setting. Our architecture combines a highly efficient, asynchronous, GPU-based sampler with off-policy correction techniques, allowing us to achieve throughput higher than 10^5 environment frames/second on non-trivial control problems in 3D without sacrificing sample efficiency. We extend Sample Factory to support self-play and population-based training and apply these techniques to train highly capable agents for a multiplayer first-person shooter game. GitHub: https://github.com/alex-petrenko/sample-factory
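
An asynchronous sampler collects experience with a policy that may lag behind the one being updated, which is why some off-policy correction is needed. The abstract does not say which correction is used, so the sketch below illustrates only one common form, truncated importance weighting, and should not be read as the paper's method. The function name `truncated_importance_weights`, the `clip` parameter, and the example probabilities are all hypothetical.

```python
import math


def truncated_importance_weights(target_logp, behavior_logp, clip=1.0):
    """Return min(clip, pi(a|s) / mu(a|s)) for each sample.

    target_logp:   log-probabilities of the taken actions under the current
                   (learner) policy pi.
    behavior_logp: log-probabilities of the same actions under the older
                   (sampler) policy mu that actually generated the data.
    clip:          upper bound on the ratio (an illustrative default).
    """
    return [
        min(clip, math.exp(t - b))
        for t, b in zip(target_logp, behavior_logp)
    ]


# Illustrative values only: three actions sampled by a stale policy.
behavior_logp = [math.log(0.5), math.log(0.2), math.log(0.25)]
target_logp = [math.log(0.25), math.log(0.4), math.log(0.25)]

weights = truncated_importance_weights(target_logp, behavior_logp)
```

Samples that the current policy considers less likely than the sampling policy did are down-weighted, while the truncation keeps any single sample from dominating the update when the two policies have drifted apart.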
