Do recent advancements in model-based deep reinforcement learning really improve data efficiency?

Abstract

Reinforcement learning (RL) has seen great advancements in the past few years. Nevertheless, the consensus among the RL community is that currently used model-free methods, despite all their benefits, suffer from extreme data inefficiency. To circumvent this problem, novel model-based approaches were introduced that often claim to be much more efficient than their model-free counterparts. In this paper, however, we demonstrate that the state-of-the-art model-free Rainbow DQN algorithm can be trained using a much smaller number of samples than is commonly reported. By simply allowing the algorithm to execute network updates more frequently, we manage to reach results similar to or better than those of existing model-based techniques, at a fraction of the complexity and computational cost. Furthermore, based on the outcomes of the study, we argue that an agent similar to the modified Rainbow DQN presented in this paper should be used as a baseline for any future work aimed at improving the sample efficiency of deep reinforcement learning.
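To make the idea of executing network updates more frequently concrete, the following minimal sketch counts how many gradient updates an agent performs for a fixed budget of environment samples. The function name count_updates, its parameters, and all numbers are hypothetical illustrations, not the settings or code used in the paper.

```python
def count_updates(env_steps, update_period, updates_per_trigger=1, warmup_steps=0):
    """Count network updates performed over `env_steps` environment steps.

    Hypothetical helper for illustration only.
    update_period:       an update is triggered every `update_period` steps
    updates_per_trigger: gradient updates executed each time it is triggered
    warmup_steps:        initial steps during which no updates are executed
    """
    if update_period < 1:
        raise ValueError("update_period must be at least 1")
    updates = 0
    for step in range(1, env_steps + 1):
        # Update only after warm-up, and only on every `update_period`-th step.
        if step > warmup_steps and step % update_period == 0:
            updates += updates_per_trigger
    return updates


# Same sample budget, different update frequency (illustrative values).
sparse = count_updates(env_steps=100_000, update_period=4)
dense = count_updates(env_steps=100_000, update_period=1)
print(sparse, dense)
```

With the sample budget held fixed, lowering update_period raises the number of network updates per collected sample, which is the quantity the modification described above increases.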
