Knowledge Retention for Continual Model-Based Reinforcement Learning

Abstract

We propose DRAGO, a novel approach for continual model-based reinforcement learning aimed at improving the incremental development of world models across a sequence of tasks that differ in their reward functions but not the state space or dynamics. DRAGO comprises two key components: Synthetic Experience Rehearsal, which leverages generative models to create synthetic experiences from past tasks, allowing the agent to reinforce previously learned dynamics without storing data, and Regaining Memories Through Exploration, which introduces an intrinsic reward mechanism to guide the agent toward revisiting relevant states from prior tasks. Together, these components enable the agent to maintain a comprehensive and continually developing world model, facilitating more effective learning and adaptation across diverse environments. Empirical evaluations demonstrate that DRAGO is able to preserve knowledge across tasks, achieving superior performance in various continual learning scenarios.
