PreND: Enhancing Intrinsic Motivation in Reinforcement Learning through Pre-trained Network Distillation

  • 2024-10-02 17:56:03
  • Mohammadamin Davoodabadi, Negin Hashemi Dijujin, Mahdieh Soleymani Baghshah

Abstract

Intrinsic motivation, inspired by the psychology of developmental learning in infants, stimulates exploration in agents without relying solely on sparse external rewards. Existing methods in reinforcement learning like Random Network Distillation (RND) face significant limitations, including (1) relying on raw visual inputs, leading to a lack of meaningful representations, (2) the inability to build a robust latent space, (3) poor target network initialization and (4) rapid degradation of intrinsic rewards. In this paper, we introduce Pre-trained Network Distillation (PreND), a novel approach to enhance intrinsic motivation in reinforcement learning (RL) by improving upon the widely used prediction-based method, RND. PreND addresses these challenges by incorporating pre-trained representation models into both the target and predictor networks, resulting in more meaningful and stable intrinsic rewards, while enhancing the representation learned by the model. We also tried simple but effective variants of the predictor network optimization by controlling the learning rate. Through experiments on the Atari domain, we demonstrate that PreND significantly outperforms RND, offering a more robust intrinsic motivation signal that leads to better exploration, improving overall performance and sample efficiency. This research highlights the importance of target and predictor networks representation in prediction-based intrinsic motivation, setting a new direction for improving RL agents' learning efficiency in sparse reward environments.
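
To make the prediction-based reward described in the abstract concrete, the following is a minimal NumPy sketch of a distillation-style intrinsic reward: a trainable predictor is fit to a frozen target, and the prediction error serves as the reward. It is not the authors' implementation. The linear layers, the dimensions, the names (W_target, W_pred, intrinsic_reward, update_predictor) and the learning rate value are all illustrative assumptions. In RND the frozen target is randomly initialized; PreND, as described above, would use pre-trained representation models instead, which this sketch only stands in for with a random matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 4  # hypothetical sizes, chosen only for illustration

# Frozen target network. Random weights correspond to the RND setting;
# a pre-trained encoder (the PreND idea) is NOT loaded here, this is a stand-in.
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))

# Trainable predictor, reduced to one linear layer for brevity.
W_pred = np.zeros((OBS_DIM, FEAT_DIM))


def intrinsic_reward(obs):
    """Mean squared prediction error per observation, used as the reward."""
    err = obs @ W_pred - obs @ W_target
    return (err ** 2).mean(axis=-1)


def update_predictor(obs, lr):
    """One gradient-descent step on the mean squared prediction error."""
    global W_pred
    err = obs @ W_pred - obs @ W_target      # shape (batch, FEAT_DIM)
    grad = 2.0 * obs.T @ err / err.size      # gradient of mean(err**2) w.r.t. W_pred
    W_pred -= lr * grad


# Repeatedly visiting the same observations shrinks their reward.
obs = rng.normal(size=(16, OBS_DIM))
before = intrinsic_reward(obs).mean()
for _ in range(200):
    update_predictor(obs, lr=0.05)           # lr is an assumed value
after = intrinsic_reward(obs).mean()
print(before > after)
```

The lr argument is the knob the abstract alludes to when it mentions controlling the predictor's learning rate: in this toy setting a smaller value makes the error, and hence the reward, decay more slowly for already-visited observations.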

 
