Exploration in Deep Reinforcement Learning: A Comprehensive Survey

Abstract

Deep Reinforcement Learning (DRL) and Deep Multi-agent Reinforcement Learning (MARL) have achieved significant successes across a wide range of domains, including game AI, autonomous vehicles, and robotics. However, DRL and deep MARL agents are widely known to be sample inefficient: millions of interactions are usually needed even for relatively simple problem settings, which prevents their wide application and deployment in real-world industrial scenarios. One bottleneck behind this is the well-known exploration problem, i.e., how to efficiently explore the environment and collect informative experiences that benefit policy learning towards optimal policies. This problem becomes more challenging in complex environments with sparse rewards, noisy distractions, long horizons, and non-stationary co-learners. In this paper, we conduct a comprehensive survey of existing exploration methods for both single-agent and multi-agent RL. We start the survey by identifying several key challenges to efficient exploration. Beyond the above two main branches, we also include other notable exploration methods with different ideas and techniques. In addition to algorithmic analysis, we provide a comprehensive and unified empirical comparison of different exploration methods for DRL on a set of commonly used benchmarks. Based on our algorithmic and empirical investigation, we finally summarize the open problems of exploration in DRL and deep MARL and point out a few future directions.
