Utilizing Evolution Strategies to Train Transformers in Reinforcement Learning

Abstract

We explore the capability of evolution strategies to train an agent whose policy is based on a transformer architecture in a reinforcement learning setting. We performed experiments using OpenAI's highly parallelizable evolution strategy to train a Decision Transformer in the MuJoCo Humanoid locomotion environment and in Atari game environments, testing the ability of this black-box optimization technique to train models that are relatively large and complicated compared to those previously tested in the literature. The examined evolution strategy proved, in general, capable of achieving strong results and produced high-performing agents, showcasing the ability of evolution strategies to train even such complex models.
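
To make the black-box update concrete, the following is a minimal sketch of one evolution-strategy step in the style of OpenAI's method, written with NumPy. It is not the authors' implementation: the function names (`centered_ranks`, `es_step`, `toy_fitness`, `run`), the hyperparameter values, and the quadratic toy objective are all assumptions made for illustration. In the setting described above, the fitness function would instead be the episode return of a Decision Transformer policy whose flattened weights play the role of `theta`.

```python
import numpy as np


def centered_ranks(x):
    """Rank-transform returns into the range [-0.5, 0.5] (fitness shaping)."""
    ranks = np.empty(len(x), dtype=np.float64)
    ranks[np.argsort(x)] = np.arange(len(x))
    return ranks / (len(x) - 1) - 0.5


def es_step(theta, fitness_fn, rng, pop_size=50, sigma=0.1, lr=0.05):
    """One ES update using mirrored (antithetic) Gaussian perturbations.

    All default hyperparameters here are illustrative assumptions.
    """
    half = pop_size // 2
    eps = rng.standard_normal((half, theta.size))
    eps = np.concatenate([eps, -eps])  # each perturbation paired with its negation
    # Each evaluation is independent, which is what makes the method parallelizable.
    returns = np.array([fitness_fn(theta + sigma * e) for e in eps])
    weights = centered_ranks(returns)
    # Gradient estimate: rank-weighted average of the perturbation directions.
    grad = weights @ eps / (len(eps) * sigma)
    return theta + lr * grad  # plain gradient ascent on the estimated gradient


def toy_fitness(theta):
    """Hypothetical stand-in for episode return; maximized at theta = 3."""
    return -np.sum((theta - 3.0) ** 2)


def run(steps=300, dim=5, seed=0):
    """Optimize the toy objective starting from the zero vector."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(dim)
    for _ in range(steps):
        theta = es_step(theta, toy_fitness, rng)
    return theta
```

Calling `run()` moves `theta` from the origin toward the optimum of the toy objective using only fitness evaluations, with no backpropagation through the model. A real setup would typically distribute the fitness evaluations across workers and use an optimizer such as Adam in place of the plain ascent step.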
