Replay to Remember: Retaining Domain Knowledge in Streaming Language Models

Abstract

Continual learning in large language models (LLMs) typically encounters the critical challenge of catastrophic forgetting, where previously acquired knowledge deteriorates upon exposure to new data. While techniques like replay buffers and parameter-efficient tuning (e.g., Low-Rank Adaptation, or LoRA) have been proposed, few studies investigate real-time domain adaptation under strict computational and data-stream constraints. In this paper, we demonstrate a lightweight method combining LoRA and a minimal replay mechanism in a realistic streaming setting across three diverse knowledge domains: medical question answering, genetics, and law. Using perplexity, semantic similarity, and GPT-based human-like evaluation metrics, we quantify the model's adaptation, forgetting, and recovery over time. Our experiments reveal that while catastrophic forgetting naturally occurs, even minimal replay significantly stabilizes and partially restores domain-specific knowledge. This study contributes practical insights for deploying adaptable LLMs in resource-constrained, real-world scenarios.
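
To make the idea of a minimal replay mechanism concrete, the following is an illustrative sketch only, not the implementation used in this paper. The abstract does not specify how the buffer is filled or sampled, so the reservoir-sampling policy, the class name ReplayBuffer, the helper mixed_batches, and the parameters capacity, batch_size, and replay_per_batch are all assumptions introduced here. The sketch shows how a small number of stored past examples could be appended to each incoming batch of a data stream before a fine-tuning step (for instance, a LoRA update, which is omitted).

```python
import random


class ReplayBuffer:
    """Fixed-size store of past examples (hypothetical helper).

    Filled by reservoir sampling, an assumed policy: every example
    seen so far has an equal chance of remaining in the buffer.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep the new example with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        # Return up to k stored examples without replacement.
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def mixed_batches(stream, buffer, batch_size=4, replay_per_batch=1):
    """Yield batches of new stream examples plus a few replayed ones."""
    batch = []
    for example in stream:
        batch.append(example)
        if len(batch) == batch_size:
            # Sample before adding, so replayed items are strictly older.
            yield batch + buffer.sample(replay_per_batch)
            for ex in batch:
                buffer.add(ex)
            batch = []
    if batch:
        # Flush the final partial batch.
        yield batch + buffer.sample(replay_per_batch)
        for ex in batch:
            buffer.add(ex)
```

In this sketch, each yielded batch would be passed to whatever training step is in use; when the stream switches domains (say, from medical question answering to law), the replayed examples are what keep earlier-domain data in front of the model.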
