Hypothesis-Driven Skill Discovery for Hierarchical Deep Reinforcement Learning

Abstract

Deep reinforcement learning encompasses many versatile tools for designing learning agents that can perform well on a variety of high-dimensional visual tasks, ranging from video games to robotic manipulation. However, these methods typically suffer from poor sample efficiency, partially because they strive to be largely problem-agnostic. In this work, we demonstrate the utility of a different approach that is extremely sample efficient. Specifically, we propose the Hypothesis Proposal and Evaluation (HyPE) algorithm, which utilizes a small set of intuitive assumptions about the behavior of objects in the physical world (or in games that mimic physics) to automatically define and learn hierarchical skills in a highly efficient manner. HyPE does this by discovering objects from raw pixel data, generating hypotheses about the controllability of observed changes in object state, and learning a hierarchy of skills that can test these hypotheses and control increasingly complex interactions with objects. We demonstrate that HyPE can dramatically improve sample efficiency when learning a high-quality pixels-to-actions policy; in the popular benchmark task, Breakout, HyPE learns an order of magnitude faster than common baseline reinforcement learning and evolutionary strategies for policy learning.
