Memory-R1: Enhancing Large Language Model Agents to Manage and Utilize Memories via Reinforcement Learning

  • 2025-10-08 16:54:13
  • Sikuan Yan, Xiufeng Yang, Zuchao Huang, Ercong Nie, Zifeng Ding, Zonggen Li, Xiaowen Ma, Kristian Kersting, Jeff Z. Pan, Hinrich Schütze, Volker Tresp, Yunpu Ma

Abstract

Large Language Models (LLMs) have demonstrated impressive capabilities across a wide range of NLP tasks, but they remain fundamentally stateless, constrained by limited context windows that hinder long-horizon reasoning. Recent efforts to address this limitation often augment LLMs with an external memory bank, yet most existing pipelines are static and heuristic-driven, lacking a learned mechanism for deciding what to store, update, or retrieve. We present Memory-R1, a reinforcement learning (RL) framework that equips LLMs with the ability to actively manage and utilize external memory through two specialized agents: a Memory Manager that learns structured operations, including ADD, UPDATE, DELETE, and NOOP; and an Answer Agent that pre-selects and reasons over relevant entries. Both agents are fine-tuned with outcome-driven RL (PPO and GRPO), enabling adaptive memory management with minimal supervision. With only 152 training QA pairs, Memory-R1 outperforms strong baselines and generalizes across diverse question types, three benchmarks (LoCoMo, MSC, LongMemEval), and multiple model scales (3B-14B).
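
To make the four memory operations named in the abstract concrete, here is a minimal sketch of an external memory bank that applies ADD, UPDATE, DELETE, and NOOP. It is an illustration only, not the authors' implementation: the names `Op`, `MemoryBank`, and `apply`, and the integer-keyed dictionary storage, are assumptions. In Memory-R1 the operation is chosen by an RL-fine-tuned Memory Manager; in this sketch the operation is simply passed in by hand.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Optional


class Op(Enum):
    """The four operations named in the abstract."""
    ADD = "ADD"
    UPDATE = "UPDATE"
    DELETE = "DELETE"
    NOOP = "NOOP"


@dataclass
class MemoryBank:
    """Hypothetical memory store: integer ids mapped to text entries."""
    entries: Dict[int, str] = field(default_factory=dict)
    next_id: int = 0

    def apply(self, op: Op, text: Optional[str] = None,
              entry_id: Optional[int] = None) -> None:
        if op is Op.ADD:
            # Store a new entry under a fresh id.
            self.entries[self.next_id] = text or ""
            self.next_id += 1
        elif op is Op.UPDATE:
            # Overwrite an existing entry; unknown ids are an error.
            if entry_id not in self.entries:
                raise KeyError(f"no memory entry with id {entry_id}")
            self.entries[entry_id] = text or ""
        elif op is Op.DELETE:
            # Remove an entry if it exists.
            self.entries.pop(entry_id, None)
        # Op.NOOP: leave the bank unchanged.


# Usage: the operations here are hand-picked, standing in for a learned policy.
bank = MemoryBank()
bank.apply(Op.ADD, text="User adopted a dog named Buddy.")
bank.apply(Op.ADD, text="User lives in Munich.")
bank.apply(Op.UPDATE, text="User adopted two dogs, Buddy and Scout.", entry_id=0)
bank.apply(Op.DELETE, entry_id=1)
bank.apply(Op.NOOP)
```

The sketch covers only the storage side. How the Memory Manager decides which operation to emit, and how the Answer Agent selects entries, are learned behaviours that the abstract describes but this example does not model.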
