Agent57: Outperforming the Atari Human Benchmark

Abstract

Atari games have been a long-standing benchmark in the reinforcement learning (RL) community for the past decade. This benchmark was proposed to test general competency of RL algorithms. Previous work has achieved good average performance by doing outstandingly well on many games of the set, but very poorly in several of the most challenging games. We propose Agent57, the first deep RL agent that outperforms the standard human benchmark on all 57 Atari games. To achieve this result, we train a neural network which parameterizes a family of policies ranging from very exploratory to purely exploitative. We propose an adaptive mechanism to choose which policy to prioritize throughout the training process. Additionally, we utilize a novel parameterization of the architecture that allows for more consistent and stable learning.
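To make the idea of adaptively prioritizing one policy out of a family more concrete, the following is a minimal sketch only. The abstract does not specify the selection rule, so this example assumes a plain UCB1 multi-armed bandit in which each arm stands for one policy in the family and the reward is the return of an episode played with that policy. The names `PolicyBandit`, `simulate`, and the constant `c` are hypothetical and are not taken from the paper.

```python
import math


class PolicyBandit:
    """UCB1 bandit over a hypothetical family of policies (illustrative only)."""

    def __init__(self, num_policies, c=1.0):
        self.counts = [0] * num_policies   # episodes played with each policy
        self.means = [0.0] * num_policies  # running mean return per policy
        self.c = c                         # exploration bonus scale (assumed)

    def select(self):
        # Play every policy once before applying the UCB rule.
        for j, n in enumerate(self.counts):
            if n == 0:
                return j
        total = sum(self.counts)
        scores = [
            mean + self.c * math.sqrt(math.log(total) / n)
            for mean, n in zip(self.means, self.counts)
        ]
        # Pick the policy with the highest upper confidence bound.
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, j, episode_return):
        # Incremental update of the mean return for policy j.
        self.counts[j] += 1
        self.means[j] += (episode_return - self.means[j]) / self.counts[j]


def simulate(returns_per_policy, rounds):
    """Run the bandit against fixed, deterministic per-policy returns."""
    bandit = PolicyBandit(len(returns_per_policy))
    for _ in range(rounds):
        j = bandit.select()
        bandit.update(j, returns_per_policy[j])
    return bandit


if __name__ == "__main__":
    # Three stand-in policies; the middle one yields the best return.
    bandit = simulate([0.0, 1.0, 0.0], rounds=200)
    print(bandit.counts)
```

In this toy setting the bandit concentrates most episodes on the policy with the highest return while still occasionally revisiting the others, which is the general behavior an adaptive prioritization mechanism is meant to provide.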
