Diversity is All You Need: Learning Skills without a Reward Function

Abstract

Intelligent creatures can explore their environments and learn useful skills without supervision. In this paper, we propose DIAYN ("Diversity is All You Need"), a method for learning useful skills without a reward function. Our proposed method learns skills by maximizing an information-theoretic objective using a maximum entropy policy. On a variety of simulated robotic tasks, we show that this simple objective results in the unsupervised emergence of diverse skills, such as walking and jumping. In a number of reinforcement learning benchmark environments, our method is able to learn a skill that solves the benchmark task despite never receiving the true task reward. In these environments, some of the learned skills correspond to solving the task, and each skill that solves the task does so in a distinct manner. Our results suggest that unsupervised discovery of skills can serve as an effective pretraining mechanism for overcoming challenges of exploration and data efficiency in reinforcement learning.
