Unsupervised Meta-Learning for Reinforcement Learning

Abstract

Meta-learning is a powerful tool that builds on multi-task learning to learn how to quickly adapt a model to new tasks. In the context of reinforcement learning, meta-learning algorithms can acquire reinforcement learning procedures to solve new problems more efficiently by meta-learning prior tasks. The performance of meta-learning algorithms critically depends on the tasks available for meta-training: in the same way that supervised learning algorithms generalize best to test points drawn from the same distribution as the training points, meta-learning methods generalize best to tasks from the same distribution as the meta-training tasks. In effect, meta-reinforcement learning offloads the design burden from algorithm design to task design. If we can automate the process of task design as well, we can devise a meta-learning algorithm that is truly automated. In this work, we take a step in this direction, proposing a family of unsupervised meta-learning algorithms for reinforcement learning. We describe a general recipe for unsupervised meta-reinforcement learning, and describe an effective instantiation of this approach based on a recently proposed unsupervised exploration technique and model-agnostic meta-learning. We also discuss practical and conceptual considerations for developing unsupervised meta-learning methods. Our experimental results demonstrate that unsupervised meta-reinforcement learning effectively acquires accelerated reinforcement learning procedures without the need for manual task design, significantly exceeds the performance of learning from scratch, and even matches performance of meta-learning methods that use hand-specified task distributions.
