Criticality-Based Varying Step-Number Algorithm for Reinforcement Learning

Abstract

In the context of reinforcement learning, we introduce the concept of criticality of a state, which indicates the extent to which the choice of action in that particular state influences the expected return. That is, a state in which the choice of action is more likely to influence the final outcome is considered more critical than a state in which it is less likely to influence the final outcome. We formulate a criticality-based varying step number algorithm (CVS), a flexible step number algorithm that utilizes a criticality function provided by a human or learned directly from the environment. We test it in three different domains: the Atari Pong environment, the Road-Tree environment, and the Shooter environment. We demonstrate that CVS is able to outperform popular learning algorithms such as Deep Q-Learning and Monte Carlo.
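To make the idea concrete, the following is a minimal Python sketch of how a per-state criticality signal could drive the step number of an n-step return. The abstract does not state the rule CVS actually uses, so everything specific here is an assumption made for illustration: the helper names (`criticality_from_q`, `choose_step_number`, `n_step_target`), the criticality proxy based on the spread of action values, and the rule of extending the lookahead until the accumulated criticality reaches a threshold. Only the n-step return itself is the standard textbook quantity.

```python
from typing import Sequence


def criticality_from_q(q_values: Sequence[float]) -> float:
    """Hypothetical criticality proxy (an assumption, not the paper's definition).

    Uses the spread between the best and worst action values, squashed
    into [0, 1): a state where all actions look equally good scores 0.
    """
    spread = max(q_values) - min(q_values)
    return spread / (1.0 + spread)


def choose_step_number(criticalities: Sequence[float], start: int,
                       threshold: float = 1.0) -> int:
    """Hypothetical rule (an assumption): look ahead from `start` until the
    accumulated criticality reaches `threshold` or the episode ends."""
    total = 0.0
    n = 0
    for c in criticalities[start:]:
        n += 1
        total += c
        if total >= threshold:
            break
    return n


def n_step_target(rewards: Sequence[float], values: Sequence[float],
                  start: int, n: int, gamma: float = 0.99) -> float:
    """Standard n-step return from time `start`.

    rewards[k] is the reward received after acting in state k, and
    values[k] is the current value estimate of state k. If the lookahead
    runs past the end of the episode, no bootstrap term is added.
    """
    end = min(start + n, len(rewards))
    g = 0.0
    for k in range(start, end):
        g += gamma ** (k - start) * rewards[k]
    if end < len(rewards):
        g += gamma ** (end - start) * values[end]
    return g


if __name__ == "__main__":
    crit = [0.1, 0.2, 0.9, 0.5]
    n = choose_step_number(crit, start=0)
    print("step number:", n)
    print("target:", n_step_target([1, 1, 1, 1], [0, 0, 0, 10], 0, n, gamma=0.5))
```

Under this assumed rule, a run of low-criticality states yields a long lookahead that resembles a Monte Carlo return, while a highly critical state cuts the lookahead short and behaves more like a one-step update. That is one way a single algorithm could sit between the two baselines named in the abstract.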
