Emergence of Hierarchy via Reinforcement Learning Using a Multiple Timescale Stochastic RNN

Abstract

Although recurrent neural networks (RNNs) for reinforcement learning (RL) have shown distinct advantages in various respects, e.g., solving memory-dependent tasks and meta-learning, very few studies have demonstrated how RNNs can solve the problem of hierarchical RL by autonomously developing hierarchical control. In this paper, we propose a novel model-free RL framework called ReMASTER, which combines an off-policy actor-critic algorithm with a multiple timescale stochastic recurrent neural network for solving memory-dependent and hierarchical tasks. We performed experiments using a challenging continuous control task and showed that: (1) the internal representation necessary for achieving hierarchical control develops autonomously through exploratory learning, and (2) stochastic neurons in RNNs enable faster relearning when adapting to a new task that is a recomposition of previously learned sub-goals.
