Deep Hierarchical Reinforcement Learning Algorithm in Partially Observable Markov Decision Processes

Abstract

In recent years, reinforcement learning has achieved many remarkablesuccesses due to the growing adoption of deep learning techniques and the rapidgrowth in computing power. Nevertheless, it is well-known that flatreinforcement learning algorithms are often not able to learn well anddata-efficient in tasks having hierarchical structures, e.g. consisting ofmultiple subtasks. Hierarchical reinforcement learning is a principled approachthat is able to tackle these challenging tasks. On the other hand, manyreal-world tasks usually have only partial observability in which statemeasurements are often imperfect and partially observable. The problems of RLin such settings can be formulated as a partially observable Markov decisionprocess (POMDP). In this paper, we study hierarchical RL in POMDP in which thetasks have only partial observability and possess hierarchical properties. Wepropose a hierarchical deep reinforcement learning approach for learning inhierarchical POMDP. The deep hierarchical RL algorithm is proposed to apply toboth MDP and POMDP learning. We evaluate the proposed algorithm on variouschallenging hierarchical POMDP.

Quick Read (beta)

loading the full paper ...