Improved Exploring Starts by Kernel Density Estimation-Based State-Space Coverage Acceleration in Reinforcement Learning

Abstract

Reinforcement learning (RL) is currently a popular research topic in control engineering and has the potential to make its way to industrial and commercial applications. Corresponding RL controllers are trained in direct interaction with the controlled system, rendering them data-driven and performance-oriented solutions. The best practice of exploring starts (ES) is used by default to support the learning process via randomly picked initial states. However, this method might deliver strongly biased results if the system's dynamics and constraints lead to unfavorable sample distributions in the state space (e.g., condensed sample accumulation in certain state-space areas). To overcome this issue, a kernel density estimation-based state-space coverage acceleration (DESSCA) is proposed, which improves the ES concept by prioritizing infrequently visited states for a more balanced coverage of the state space during training. Compared to neighbouring methods in the field of count-based exploration, DESSCA can also be applied to continuous state spaces without the need for artificial discretization of the states. Moreover, the algorithm allows arbitrary reference state distributions to be defined such that the state coverage can be shaped w.r.t. the application needs. The considered test scenarios are mountain car, cartpole and electric motor control environments. Using DQN and DDPG as exemplary RL algorithms, it is shown that DESSCA is a simple yet effective algorithmic extension to the established ES approach that increases learning stability as well as the final control performance.
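
The core idea, picking initial states where the estimated density of already visited states is low, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`gaussian_kde_density`, `propose_initial_state`), the Gaussian kernel, the fixed bandwidth, the random candidate search and the implicit uniform reference distribution are all assumptions made for illustration only.

```python
import numpy as np


def gaussian_kde_density(queries, samples, bandwidth):
    """Gaussian kernel density of `samples` at `queries`, up to a constant factor.

    Both arrays have shape (n, dim). The normalization constant is omitted
    because only the relative ordering of the densities is needed here.
    """
    if len(samples) == 0:
        return np.zeros(len(queries))
    diff = queries[:, None, :] - samples[None, :, :]
    sq_dist = np.sum((diff / bandwidth) ** 2, axis=-1)
    return np.exp(-0.5 * sq_dist).mean(axis=1)


def propose_initial_state(visited, low, high, bandwidth=0.1,
                          n_candidates=256, rng=None):
    """Return a candidate initial state from the least densely covered region.

    Hypothetical helper: assumes a box-shaped state space [low, high] and a
    uniform reference distribution, so the least visited candidate is simply
    the one with the lowest estimated density.
    """
    rng = np.random.default_rng() if rng is None else rng
    low = np.asarray(low, dtype=float)
    high = np.asarray(high, dtype=float)
    scale = high - low
    visited = np.asarray(visited, dtype=float).reshape(-1, low.size)

    # Draw random candidates and compare them in normalized coordinates so
    # that one bandwidth is meaningful for all state dimensions.
    candidates = rng.uniform(low, high, size=(n_candidates, low.size))
    density = gaussian_kde_density((candidates - low) / scale,
                                   (visited - low) / scale,
                                   bandwidth)
    return candidates[np.argmin(density)]
```

In a training loop, such a function would be called at every episode reset with the states collected so far, and the returned state would be used in place of a uniformly sampled one. A non-uniform reference distribution, as mentioned in the abstract, would require comparing the estimated density against a reference density instead of taking the plain minimum.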
