Abstract
Agents trained via reinforcement learning (RL) often struggle to perform well on tasks that differ from those encountered during training. This limitation presents a challenge to the broader deployment of RL in diverse and dynamic task settings. In this work, we introduce memory augmentation, a memory-based RL approach to improve task generalization. Our approach leverages task-structured augmentations to simulate plausible out-of-distribution scenarios and incorporates memory mechanisms to enable context-aware policy adaptation. Trained on a predefined set of tasks, our policy demonstrates the ability to generalize to unseen tasks through memory augmentation without requiring additional interactions with the environment. Through extensive simulation experiments and real-world hardware evaluations on legged locomotion tasks, we demonstrate that our approach achieves zero-shot generalization to unseen tasks while maintaining robust in-distribution performance and high sample efficiency.