Acme: A Research Framework for Distributed Reinforcement Learning

  • 2022-09-20 18:15:51
  • Matthew W. Hoffman, Bobak Shahriari, John Aslanides, Gabriel Barth-Maron, Nikola Momchev, Danila Sinopalnikov, Piotr Stańczyk, Sabela Ramos, Anton Raichuk, Damien Vincent, Léonard Hussenot, Robert Dadashi, Gabriel Dulac-Arnold, Manu Orsini, Alexis Jacq, Johan Ferret, Nino Vieillard, Seyed Kamyar Seyed Ghasemipour, Sertan Girgin, Olivier Pietquin, Feryal Behbahani, Tamara Norman, Abbas Abdolmaleki, Albin Cassirer, Fan Yang, Kate Baumli, Sarah Henderson, Abe Friesen, Ruba Haroun, Alex Novikov, Sergio Gómez Colmenarejo, Serkan Cabi, Caglar Gulcehre, Tom Le Paine, Srivatsan Srinivasan, Andrew Cowie, Ziyu Wang, Bilal Piot, Nando de Freitas

Abstract

Deep reinforcement learning (RL) has led to many recent and groundbreaking advances. However, these advances have often come at the cost of both increased scale in the underlying architectures being trained as well as increased complexity of the RL algorithms used to train them. These increases have in turn made it more difficult for researchers to rapidly prototype new ideas or reproduce published RL algorithms. To address these concerns this work describes Acme, a framework for constructing novel RL algorithms that is specifically designed to enable agents that are built using simple, modular components that can be used at various scales of execution. While the primary goal of Acme is to provide a framework for algorithm development, a secondary goal is to provide simple reference implementations of important or state-of-the-art algorithms. These implementations serve both as a validation of our design decisions as well as an important contribution to reproducibility in RL research. In this work we describe the major design decisions made within Acme and give further details as to how its components can be used to implement various algorithms. Our experiments provide baselines for a number of common and state-of-the-art algorithms as well as showing how these algorithms can be scaled up for much larger and more complex environments. This highlights one of the primary advantages of Acme, namely that it can be used to implement large, distributed RL algorithms that can run at massive scales while still maintaining the inherent readability of that implementation. This work presents a second version of the paper which coincides with an increase in modularity, additional emphasis on offline, imitation and learning from demonstrations algorithms, as well as various new agents implemented as part of Acme.

 
