Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning

Abstract

Effective decision-making in partially observable environments demands robust memory management. Despite their success in supervised learning, current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term. They fail to efficiently capture relevant past information, adapt flexibly to changing observations, and maintain stable updates over long episodes. We theoretically analyze the limitations of existing memory models within a unified framework and introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents. Our model dynamically adjusts memory by erasing no longer needed experiences and reinforcing crucial ones computationally efficiently. To this end, we leverage the Hadamard product for calibrating and updating memory, specifically designed to enhance memory capacity while mitigating numerical and learning challenges. Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks, such as meta-reinforcement learning, long-horizon credit assignment, and POPGym, demonstrating superior performance in handling long-term and evolving contexts.
