LightMem: Lightweight and Efficient Memory-Augmented Generation

Abstract

Despite their remarkable capabilities, Large Language Models (LLMs) struggle to effectively leverage historical interaction information in dynamic and complex environments. Memory systems enable LLMs to move beyond stateless interactions by introducing persistent information storage, retrieval, and utilization mechanisms. However, existing memory systems often introduce substantial time and computational overhead. To address this, we introduce LightMem, a new memory system that strikes a balance between performance and efficiency. Inspired by the Atkinson-Shiffrin model of human memory, LightMem organizes memory into three complementary stages. First, cognition-inspired sensory memory rapidly filters irrelevant information through lightweight compression and groups the remaining information by topic. Next, topic-aware short-term memory consolidates these topic-based groups, organizing and summarizing content for more structured access. Finally, long-term memory with sleep-time update employs an offline procedure that decouples consolidation from online inference. Experiments on LongMemEval with GPT and Qwen backbones show that LightMem outperforms strong baselines in accuracy (up to 10.9% gains) while reducing token usage by up to 117x, API calls by up to 159x, and runtime by over 12x. The code is available at https://github.com/zjunlp/LightMem.
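
To make the three-stage organization concrete, the following is a minimal toy sketch in Python. It is not the LightMem implementation and does not use the LightMem API: every name here (`sensory_filter`, `assign_topic`, `ToyMemory`, `sleep_update`, the filler-word list, the keyword topics) is a hypothetical stand-in. Filler-word removal stands in for lightweight compression, keyword matching for topic grouping, and string concatenation for summarization. The point it illustrates is only the control flow: online writes are cheap, and consolidation happens in a separate offline step.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# Hypothetical stand-ins: a real system would use a learned compressor,
# a topic segmenter, and an LLM summarizer instead of these toy rules.
FILLER = {"um", "uh", "well", "like", "okay"}
TOPIC_KEYWORDS = {
    "travel": {"flight", "hotel", "trip"},
    "food": {"pizza", "dinner", "recipe"},
}


def _normalize(word: str) -> str:
    """Lowercase a word and strip surrounding punctuation."""
    return word.lower().strip(",.!?")


def sensory_filter(message: str) -> str:
    """Stage 1a (toy): cheap compression by dropping filler words."""
    return " ".join(w for w in message.split() if _normalize(w) not in FILLER)


def assign_topic(message: str) -> str:
    """Stage 1b (toy): topic grouping by keyword overlap."""
    words = {_normalize(w) for w in message.split()}
    for topic, keywords in TOPIC_KEYWORDS.items():
        if words & keywords:
            return topic
    return "misc"


@dataclass
class ToyMemory:
    buffer_limit: int = 2  # messages per topic before summarizing
    short_term: dict = field(default_factory=lambda: defaultdict(list))
    pending: list = field(default_factory=list)    # written online
    long_term: list = field(default_factory=list)  # consolidated offline

    def observe(self, message: str) -> None:
        """Online path: filter, group by topic, summarize full buffers."""
        compressed = sensory_filter(message)
        topic = assign_topic(compressed)
        self.short_term[topic].append(compressed)
        if len(self.short_term[topic]) >= self.buffer_limit:
            # Stage 2 (toy): the "summary" is just the joined buffer.
            summary = " | ".join(self.short_term.pop(topic))
            self.pending.append((topic, summary))

    def sleep_update(self) -> None:
        """Stage 3 (toy): offline consolidation, decoupled from observe()."""
        merged = defaultdict(list)
        for topic, summary in self.long_term + self.pending:
            merged[topic].append(summary)
        self.long_term = [(t, " | ".join(s)) for t, s in merged.items()]
        self.pending.clear()
```

In this sketch, calling `observe` twice with travel-related messages fills the travel buffer and appends one summary to `pending`; `long_term` stays empty until `sleep_update` is called, which mirrors the idea of deferring consolidation to an offline phase.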
