Distributed Deep Reinforcement Learning: Learn how to play Atari games in 21 minutes

  • 2018-04-09 15:36:09
  • Igor Adamski, Robert Adamski, Tomasz Grel, Adam Jędrych, Kamil Kaczmarek, Henryk Michalewski

Abstract

We present a study in Distributed Deep Reinforcement Learning (DDRL) focused on the scalability of a state-of-the-art Deep Reinforcement Learning algorithm known as Batch Asynchronous Advantage Actor-Critic (BA3C). We show that using the Adam optimization algorithm with a batch size of up to 2048 is a viable choice for carrying out large-scale machine learning computations. This, combined with a careful reexamination of the optimizer's hyperparameters, synchronous training at the node level (while keeping the local, single-node part of the algorithm asynchronous), and minimizing the memory footprint of the model, allowed us to achieve linear scaling for up to 64 CPU nodes. This corresponds to a training time of 21 minutes on 768 CPU cores, as opposed to 10 hours for a baseline single-node implementation running on a single node with 24 cores.
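
As a rough check on the figures quoted above, the wall-clock speedup can be compared with the increase in core count. The sketch below uses only the numbers stated in the abstract; the variable names are illustrative, "10 hours" is taken as exactly 600 minutes, and the abstract states its linear-scaling claim in terms of nodes rather than cores, so the result is only an approximation.

```python
# Figures taken from the abstract; variable names are illustrative only.
baseline_minutes = 10 * 60    # about 10 hours on a single 24-core node
baseline_cores = 24
distributed_minutes = 21      # 21 minutes on 768 CPU cores
distributed_cores = 768

speedup = baseline_minutes / distributed_minutes   # wall-clock speedup
core_ratio = distributed_cores / baseline_cores    # how many times more cores were used
efficiency = speedup / core_ratio                  # 1.0 would be perfectly linear in cores

print(f"speedup: {speedup:.1f}x, cores: {core_ratio:.0f}x, efficiency: {efficiency:.0%}")
```

Under these assumptions the run is about 28.6 times faster on 32 times as many cores, or roughly 89% of ideal per-core scaling.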

 
