TensorFlow Agents: Efficient Batched Reinforcement Learning in TensorFlow

Abstract

We introduce TensorFlow Agents, an efficient infrastructure paradigm for building parallel reinforcement learning algorithms in TensorFlow. We simulate multiple environments in parallel, and group them to perform the neural network computation on a batch rather than individual observations. This allows the TensorFlow execution engine to parallelize computation, without the need for manual synchronization. Environments are stepped in separate Python processes to progress them in parallel without interference of the global interpreter lock. As part of this project, we introduce BatchPPO, an efficient implementation of the proximal policy optimization algorithm. By open sourcing TensorFlow Agents, we hope to provide a flexible starting point for future projects that accelerates future research in the field.
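
The batching idea described above can be illustrated with a minimal sketch. This is not the TensorFlow Agents API: the names ToyEnv, BatchEnv, batch_policy, and rollout are hypothetical, a plain Python function stands in for the neural network, and the code assumes a POSIX system where the "fork" start method is available. Each environment lives in its own process so that stepping is not serialized by the global interpreter lock, and the policy is evaluated once per step on the whole batch of observations.

```python
import multiprocessing as mp
import random


class ToyEnv:
    """Hypothetical 1-D random-walk environment (illustration only)."""

    def __init__(self, seed):
        self._rng = random.Random(seed)
        self._pos = 0.0

    def reset(self):
        self._pos = 0.0
        return self._pos

    def step(self, action):
        # Move by the action plus a little noise; reward is negative distance from 0.
        self._pos += action + self._rng.uniform(-0.1, 0.1)
        return self._pos, -abs(self._pos)


def _worker(conn, seed):
    """Runs in a child process: owns one environment and serves commands."""
    env = ToyEnv(seed)
    while True:
        cmd, payload = conn.recv()
        if cmd == "reset":
            conn.send(env.reset())
        elif cmd == "step":
            conn.send(env.step(payload))
        else:  # "close"
            conn.close()
            break


class BatchEnv:
    """Groups several environments, each stepped in a separate process."""

    def __init__(self, num_envs):
        ctx = mp.get_context("fork")  # assumption: POSIX, fork available
        self._conns, self._procs = [], []
        for seed in range(num_envs):
            parent, child = ctx.Pipe()
            proc = ctx.Process(target=_worker, args=(child, seed), daemon=True)
            proc.start()
            child.close()  # the parent keeps only its own end of the pipe
            self._conns.append(parent)
            self._procs.append(proc)

    def reset(self):
        for conn in self._conns:
            conn.send(("reset", None))
        return [conn.recv() for conn in self._conns]

    def step(self, actions):
        # Send all actions first so the environments advance concurrently,
        # then collect the results into one batch.
        for conn, action in zip(self._conns, actions):
            conn.send(("step", action))
        results = [conn.recv() for conn in self._conns]
        observations, rewards = zip(*results)
        return list(observations), list(rewards)

    def close(self):
        for conn in self._conns:
            conn.send(("close", None))
            conn.close()
        for proc in self._procs:
            proc.join()


def batch_policy(observations):
    """Stand-in for a network forward pass: one call handles the whole batch."""
    return [-0.5 * obs for obs in observations]


def rollout(num_envs=4, num_steps=10):
    """Collects returns from num_envs environments stepped in lockstep."""
    envs = BatchEnv(num_envs)
    try:
        observations = envs.reset()
        returns = [0.0] * num_envs
        for _ in range(num_steps):
            actions = batch_policy(observations)
            observations, rewards = envs.step(actions)
            returns = [ret + rew for ret, rew in zip(returns, rewards)]
        return observations, returns
    finally:
        envs.close()
```

Calling rollout(num_envs=4, num_steps=10) returns the final observations and the per-environment returns. In this sketch the only synchronization point is the collection of results in BatchEnv.step, which reflects the lockstep batching described in the abstract; a real implementation would replace batch_policy with a batched network evaluation.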
