Learning-Driven Exploration for Reinforcement Learning

Abstract

Effective and intelligent exploration remains an unresolved problem for reinforcement learning. Most contemporary reinforcement learning relies on simple heuristic strategies such as $\epsilon$-greedy exploration or adding Gaussian noise to actions. These heuristics, however, cannot intelligently distinguish well-explored from unexplored regions of the state space, which can lead to inefficient use of training time. We introduce entropy-based exploration (EBE), which enables an agent to efficiently explore the unexplored regions of the state space. EBE quantifies the agent's learning in a state using only state-dependent action values and adapts exploration accordingly, i.e., it explores more in the unexplored regions of the state space. We perform experiments on a diverse set of environments and demonstrate that EBE enables efficient exploration that ultimately results in faster learning without having to tune any hyperparameter. The code to reproduce the experiments is given at \url{https://github.com/Usama1002/EBE-Exploration} and the supplementary video is given at \url{https://youtu.be/nJggIjjzKic}.
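
As a rough illustration of the idea, the sketch below shows one way an agent could turn the action values of a state into an exploration probability. The abstract does not give the formula, so the specifics here are assumptions: the action values are passed through a softmax, the entropy of that distribution is divided by $\log n$ (for $n$ actions) to land in $[0, 1]$, and the result is used as the probability of taking a random action. The function names `normalized_entropy` and `select_action` are hypothetical and are not taken from the authors' repository.

```python
import math
import random


def normalized_entropy(q_values):
    """Entropy of softmax(q_values), scaled to [0, 1] by dividing by log(n).

    Assumption: softmax + log(n) normalization is one plausible reading of
    "entropy-based"; the abstract does not state the exact formula.
    """
    n = len(q_values)
    if n < 2:
        return 0.0
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(q - m) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)


def select_action(q_values, rng=random):
    """Explore with probability equal to the normalized entropy, else act greedily."""
    if rng.random() < normalized_entropy(q_values):
        return rng.randrange(len(q_values))  # uniform random action
    return max(range(len(q_values)), key=lambda a: q_values[a])  # greedy action
```

Under these assumptions, a state whose action values are all nearly equal (little has been learned there) gives a normalized entropy close to 1, so the agent mostly explores; a state with one clearly dominant action value gives an entropy close to 0, so the agent mostly exploits. No exploration rate or schedule has to be set by hand.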
