Causal Knowledge Transfer for Multi-Agent Reinforcement Learning in Dynamic Environments

Abstract

[Context] Multi-agent reinforcement learning (MARL) has achieved notable success in environments where agents must learn coordinated behaviors. However, transferring knowledge across agents remains challenging in non-stationary environments with changing goals. [Problem] Traditional knowledge transfer methods in MARL struggle to generalize, and agents often require costly retraining to adapt. [Approach] This paper introduces a causal knowledge transfer framework that enables RL agents to learn and share compact causal representations of paths within a non-stationary environment. As the environment changes (new obstacles appear), agents collide and need adaptive recovery strategies. We model each collision as a causal intervention instantiated as a sequence of recovery actions (a macro) whose effect corresponds to causal knowledge of how to circumvent the obstacle while increasing the chances of achieving the agent's goal (maximizing cumulative reward). This recovery action macro is transferred online from a second agent and is applied in a zero-shot fashion, i.e., without retraining, just by querying a lookup model with local context information (collisions). [Results] Our findings reveal two key insights: (1) agents with heterogeneous goals were able to bridge about half of the gap between random exploration and a fully retrained policy when adapting to new environments, and (2) the impact of causal knowledge transfer depends on the interplay between environment complexity and agents' heterogeneous goals.
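
The lookup step described in the approach can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class `MacroLookup`, the `(blocked_move, goal_direction)` context encoding, and the action names are all hypothetical and chosen only to show how a recovery macro recorded by one agent might be reused by another without retraining.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding of the local collision context: the move that was
# blocked and the coarse direction of the goal relative to the agent.
Context = Tuple[str, str]
Macro = List[str]


@dataclass
class MacroLookup:
    """Shared table mapping a collision context to a recovery action macro."""

    table: Dict[Context, Macro] = field(default_factory=dict)

    def record(self, context: Context, macro: Macro) -> None:
        # Called by the agent that found a working recovery sequence.
        self.table[context] = list(macro)

    def query(self, context: Context) -> Optional[Macro]:
        # Zero-shot use: a plain table lookup, with no policy update.
        macro = self.table.get(context)
        return list(macro) if macro is not None else None


shared = MacroLookup()

# Agent B collided while moving right toward a goal on its right
# and recorded the detour that worked (illustrative actions only).
shared.record(("right", "goal_right"), ["up", "right", "right", "down"])

# Agent A later hits the same kind of obstacle and reuses the macro.
macro_a = shared.query(("right", "goal_right"))

# An unseen context returns None; the agent would fall back to its own policy.
fallback = shared.query(("left", "goal_up"))
```

In this sketch the transfer cost is a single dictionary lookup per collision, which is one plausible reading of "applied in a zero-shot fashion"; how the paper actually represents context and selects macros may differ.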
