Abstract
Learning to reach goal states and learning diverse skills through mutual information (MI) maximization have been proposed as principled frameworks for self-supervised reinforcement learning, allowing agents to acquire broadly applicable multitask policies with minimal reward engineering. Starting from a simple observation that the standard goal-conditioned RL (GCRL) is encapsulated by the optimization objective of variational empowerment, we discuss how GCRL and MI-based RL can be generalized into a single family of methods, which we name variational GCRL (VGCRL), interpreting variational MI maximization, or variational empowerment, as representation learning methods that acquire functionally-aware state representations for goal reaching. This novel perspective allows us to: (1) derive simple but unexplored variants of GCRL to study how adding small representation capacity can already expand its capabilities; (2) investigate how discriminator function capacity and smoothness determine the quality of discovered skills, or latent goals, through modifying latent dimensionality and applying spectral normalization; (3) adapt techniques such as hindsight experience replay (HER) from GCRL to MI-based RL; and lastly, (4) propose a novel evaluation metric, named latent goal reaching (LGR), for comparing empowerment algorithms with different choices of latent dimensionality and discriminator parameterization. Through principled mathematical derivations and careful experimental studies, our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
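
The observation that standard GCRL is encapsulated by the variational empowerment objective can be sketched with the usual variational lower bound on mutual information. The notation here is assumed for illustration and does not appear in the abstract: z is a latent goal drawn from a prior p(z), s is a state visited by the goal-conditioned policy π, q_λ(z | s) is a learned discriminator, d is the state dimension, and σ is a fixed scale. Treat this as a minimal sketch under those assumptions, not as the paper's exact formulation.

```latex
% Variational lower bound on the mutual information between latent goals and states.
% The gap is E_s[ KL( p(z|s) || q_\lambda(z|s) ) ] >= 0, so the bound always holds.
\begin{align}
I(s; z) &= \mathcal{H}(z) - \mathcal{H}(z \mid s) \\
        &\ge \mathcal{H}(z)
           + \mathbb{E}_{z \sim p(z),\; s \sim \pi(\cdot \mid z)}
             \bigl[ \log q_\lambda(z \mid s) \bigr].
\end{align}

% Assumed special case: the latent z is a goal g in state space, and the
% discriminator is a fixed Gaussian centred on the state, with no learned parameters.
\begin{align}
q(g \mid s) &= \mathcal{N}\bigl(g;\; s,\; \sigma^2 I\bigr), \\
\log q(g \mid s) &= -\frac{1}{2\sigma^2} \lVert g - s \rVert_2^2
                    - \frac{d}{2} \log\bigl(2\pi\sigma^2\bigr).
\end{align}
```

Under this assumed choice, the intrinsic reward log q(g | s) is, up to a constant offset and a scale factor, the negative squared Euclidean distance to the goal, which is a common GCRL reward. Replacing the fixed Gaussian with a discriminator that has learnable parameters is one way to read the "small representation capacity" mentioned in item (1).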