Retrieval-Augmented Hierarchical in-Context Reinforcement Learning and Hindsight Modular Reflections for Task Planning with LLMs

Abstract

Large Language Models (LLMs) have demonstrated remarkable abilities in various language tasks, making them promising candidates for decision-making in robotics. Inspired by Hierarchical Reinforcement Learning (HRL), we propose Retrieval-Augmented in-context reinforcement Learning (RAHL), a novel framework in which an LLM-based high-level policy decomposes a complex task into sub-tasks on the fly. The sub-tasks, defined by goals, are assigned to the low-level policy to complete. To improve the agent's performance in multi-episode execution, we propose Hindsight Modular Reflection (HMR), where, instead of reflecting on the full trajectory, the agent reflects on shorter sub-trajectories to improve reflection efficiency. We evaluated the decision-making ability of the proposed RAHL in three benchmark environments: ALFWorld, Webshop, and HotpotQA. The results show that RAHL improves performance by 9%, 42%, and 10%, respectively, over strong baselines within 5 episodes of execution. Furthermore, we implemented RAHL on the Boston Dynamics SPOT robot. The experiment shows that the robot, controlled by the LLM policy, can scan the environment, find entrances, and navigate to new rooms.
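
The control flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only the two ideas stated above (goals proposed on the fly by a high-level policy, and reflection over each sub-trajectory rather than the whole trajectory), and it omits the retrieval and hindsight components, which the abstract does not detail. All names here (`SubTrajectory`, `run_episode`, `modular_reflections`, and the callables `propose_goal`, `act`, `env_step`, `goal_reached`, `reflect`) are hypothetical; in a real system the policies and the reflection step would be LLM calls, which are replaced below by plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class SubTrajectory:
    """Steps taken while pursuing one goal (hypothetical container)."""
    goal: str
    steps: List[Tuple[str, str]] = field(default_factory=list)  # (action, observation)
    achieved: bool = False


def run_episode(
    task: str,
    propose_goal: Callable[[str, List[SubTrajectory]], Optional[str]],
    act: Callable[[str, str], str],
    env_step: Callable[[str], str],
    goal_reached: Callable[[str, str], bool],
    max_goals: int = 5,
    max_steps: int = 5,
) -> List[SubTrajectory]:
    """High-level policy proposes goals one at a time; low-level policy acts on each."""
    trajectory: List[SubTrajectory] = []
    observation = task  # assumption: the task text serves as the first observation
    for _ in range(max_goals):
        goal = propose_goal(task, trajectory)  # decided on the fly from progress so far
        if goal is None:  # the high-level policy signals that the task is finished
            break
        sub = SubTrajectory(goal)
        for _ in range(max_steps):
            action = act(goal, observation)
            observation = env_step(action)
            sub.steps.append((action, observation))
            if goal_reached(goal, observation):
                sub.achieved = True
                break
        trajectory.append(sub)
    return trajectory


def modular_reflections(
    trajectory: List[SubTrajectory],
    reflect: Callable[[SubTrajectory], str],
) -> List[str]:
    """Produce one reflection per sub-trajectory instead of one for the full run."""
    return [reflect(sub) for sub in trajectory]


# Toy stand-ins for the policies and environment, for illustration only.
toy_goals = ["find mug", "put mug in sink"]
trajectory = run_episode(
    task="clean the mug",
    propose_goal=lambda task, done: toy_goals[len(done)] if len(done) < len(toy_goals) else None,
    act=lambda goal, obs: f"do({goal})",
    env_step=lambda action: f"done:{action}",
    goal_reached=lambda goal, obs: goal in obs,
)
notes = modular_reflections(
    trajectory,
    lambda sub: f"{sub.goal}: {'ok' if sub.achieved else 'failed'} in {len(sub.steps)} step(s)",
)
print(notes)
```

Because each reflection sees only one goal and its steps, the text passed to the reflecting model stays short, which is the efficiency argument the abstract makes for reflecting on sub-trajectories.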
