CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning

Abstract

In reinforcement learning (RL), it is challenging to learn directly from high-dimensional observations, where data augmentation has recently been shown to remedy this via encoding invariances from raw pixels. Nevertheless, we empirically find that not all samples are equally important and hence simply injecting more augmented inputs may instead cause instability in Q-learning. In this paper, we approach this problem systematically by developing a model-agnostic Contrastive-Curiosity-Driven Learning Framework (CCLF), which can fully exploit sample importance and improve learning efficiency in a self-supervised manner. Facilitated by the proposed contrastive curiosity, CCLF is capable of prioritizing the experience replay, selecting the most informative augmented inputs, and more importantly regularizing the Q-function as well as the encoder to concentrate more on under-learned data. Moreover, it encourages the agent to explore with a curiosity-based reward. As a result, the agent can focus on more informative samples and learn representation invariances more efficiently, with significantly reduced augmented inputs. We apply CCLF to several base RL algorithms and evaluate on the DeepMind Control Suite, Atari, and MiniGrid benchmarks, where our approach demonstrates superior sample efficiency and learning performances compared with other state-of-the-art methods.
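
The abstract does not define the contrastive curiosity, so the following is only an illustrative sketch of how a curiosity score could drive prioritized experience replay. It assumes, hypothetically, that curiosity is one minus the softmax probability that a query embedding matches its positive key under a dot-product similarity; the function names, the temperature value, and the proportional sampling rule are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random


def contrastive_curiosity(query, keys, positive_index, temperature=0.1):
    """Hypothetical curiosity score in [0, 1].

    Computes 1 - p, where p is the softmax probability that `query`
    matches the key at `positive_index` (dot-product similarity).
    A poorly matched (under-learned) sample gets a score near 1.
    """
    logits = [sum(q * k for q, k in zip(query, key)) / temperature
              for key in keys]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    p_positive = exps[positive_index] / sum(exps)
    return 1.0 - p_positive


def sample_by_curiosity(curiosities, batch_size, rng=random):
    """Sample replay indices with probability proportional to curiosity.

    A small epsilon keeps every transition reachable (an assumption,
    borrowed from common prioritized-replay practice).
    """
    eps = 1e-6
    weights = [c + eps for c in curiosities]
    return rng.choices(range(len(weights)), weights=weights, k=batch_size)


if __name__ == "__main__":
    query = [1.0, 0.0]
    keys = [[1.0, 0.0], [0.0, 1.0]]
    well_learned = contrastive_curiosity(query, keys, positive_index=0)
    under_learned = contrastive_curiosity(query, keys, positive_index=1)
    print(well_learned < under_learned)
    print(sample_by_curiosity([well_learned, under_learned], batch_size=8))
```

Under these assumptions, transitions whose augmented views are still hard to match are replayed more often, which is one way to read "concentrate more on under-learned data".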
