Hierarchical Continual Reinforcement Learning via Large Language Model

Abstract

The ability to learn continuously in dynamic environments is a crucial requirement for reinforcement learning (RL) agents applied in the real world. Despite progress in continual reinforcement learning (CRL), existing methods often suffer from insufficient knowledge transfer, particularly when the tasks are diverse. To address this challenge, we propose a new framework, Hierarchical Continual reinforcement learning via large language model (Hi-Core), designed to facilitate the transfer of high-level knowledge. Hi-Core orchestrates a two-layer structure: high-level policy formulation by a large language model (LLM), which generates a sequence of goals, and low-level policy learning that closely aligns with goal-oriented RL practices, producing the agent's actions in response to those goals. The framework employs feedback to iteratively adjust and verify high-level policies, storing them along with low-level policies in a skill library. When encountering a new task, Hi-Core retrieves relevant experience from this library to aid learning. In experiments on Minigrid, Hi-Core demonstrates its effectiveness in handling diverse CRL tasks and outperforms popular baselines.
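The store-and-retrieve role of the skill library described above can be illustrated with a minimal sketch. It is not the Hi-Core implementation: the names `Skill`, `SkillLibrary`, and `jaccard`, the token-overlap similarity, and the example task strings are all assumptions made for illustration, since the abstract does not specify how relevance is measured or how policies are stored.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Skill:
    """One library entry (hypothetical layout, not taken from the paper)."""
    task: str          # natural-language task description
    goals: List[str]   # verified high-level policy: a sequence of goals
    policy_id: str     # handle for the stored low-level policy


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity, assumed here as a stand-in retrieval metric."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class SkillLibrary:
    skills: List[Skill] = field(default_factory=list)

    def store(self, skill: Skill) -> None:
        """Add a verified high-level policy and its low-level policy handle."""
        self.skills.append(skill)

    def retrieve(self, task: str, k: int = 1) -> List[Skill]:
        """Return up to k stored skills, most similar to the new task first."""
        ranked = sorted(self.skills, key=lambda s: jaccard(task, s.task), reverse=True)
        return ranked[:k]


library = SkillLibrary()
library.store(Skill("open the door with the key", ["pick up key", "open door"], "policy_0"))
library.store(Skill("go to the goal across the lava", ["avoid lava", "reach goal"], "policy_1"))

# A new task retrieves the most relevant prior experience to seed learning.
best = library.retrieve("pick up the key and open the locked door")[0]
```

In this sketch, the retrieved goal sequence could be supplied to the LLM as context when it formulates the high-level policy for the new task, and the associated low-level policy could be used to initialize learning. Both uses are assumptions about how retrieved experience might be consumed.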
