Hypothesis Network Planned Exploration for Rapid Meta-Reinforcement Learning Adaptation

Abstract

Meta Reinforcement Learning (Meta RL) trains agents that adapt to fast-changing environments and tasks. Current strategies often lose adaptation efficiency because model exploration is passive, which delays the agent's understanding of new transition dynamics. As a result, particularly fast-evolving tasks can become impossible to solve. We propose a novel approach, Hypothesis Network Planned Exploration (HyPE), that integrates an active and planned exploration process via a hypothesis network to optimize adaptation speed. HyPE uses a generative hypothesis network to form potential models of state transition dynamics, then eliminates incorrect models through strategically devised experiments. Evaluated on a symbolic version of the Alchemy game, HyPE outpaces baseline methods in adaptation speed and model accuracy, validating its potential for enhancing reinforcement learning adaptation in rapidly evolving settings.
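
The generate-then-eliminate idea described in the abstract can be illustrated with a toy sketch. This is not the HyPE implementation: the finite hypothesis set, the `AdditiveModel` class, and the functions `pick_experiment`, `eliminate`, and `identify_dynamics` are hypothetical names introduced here, and a hand-enumerated list of candidate models stands in for the generative hypothesis network. The sketch assumes deterministic dynamics and picks, at each step, the action whose predicted outcomes split the surviving hypotheses most evenly.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AdditiveModel:
    """Hypothetical toy dynamics model: taking `action` adds deltas[action] to the state."""
    deltas: tuple

    def __call__(self, state, action):
        return state + self.deltas[action]


def pick_experiment(hypotheses, state, actions):
    """Pick the action that leaves the fewest hypotheses alive in the worst case."""
    def worst_case_survivors(action):
        predictions = Counter(h(state, action) for h in hypotheses)
        return max(predictions.values())
    return min(actions, key=worst_case_survivors)


def eliminate(hypotheses, state, action, observed_next_state):
    """Keep only the hypotheses whose prediction matches the observation."""
    return [h for h in hypotheses if h(state, action) == observed_next_state]


def identify_dynamics(hypotheses, true_step, state, actions, max_steps=10):
    """Alternate planned experiments and elimination until one hypothesis remains."""
    remaining = list(hypotheses)
    steps = 0
    while len(remaining) > 1 and steps < max_steps:
        action = pick_experiment(remaining, state, actions)
        next_state = true_step(state, action)  # run the experiment in the environment
        remaining = eliminate(remaining, state, action, next_state)
        state = next_state
        steps += 1
    return remaining, steps


# Toy setup (assumed for illustration): six candidate models, one of which is true.
candidates = [AdditiveModel((a, b)) for a in (1, 2) for b in (1, 2, 3)]
true_model = AdditiveModel((2, 3))
remaining, steps = identify_dynamics(candidates, true_model, state=0, actions=(0, 1))
print(remaining, steps)
```

In this toy setup the planner tests action 1 first, because its three possible outcomes split the six candidates into groups of two, and then uses action 0 to separate the two survivors. A passive explorer that chose actions at random could repeat an uninformative action and take longer to isolate the true model.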
