Abstract
This paper introduces LLM-MARL, a unified framework that incorporates large language models (LLMs) into multi-agent reinforcement learning (MARL) to enhance coordination, communication, and generalization in simulated game environments. The framework features three modular components: Coordinator, Communicator, and Memory, which dynamically generate subgoals, facilitate symbolic inter-agent messaging, and support episodic recall, respectively. Training combines PPO with a language-conditioned loss and LLM query gating. LLM-MARL is evaluated in Google Research Football, MAgent Battle, and StarCraft II. Results show consistent improvements over MAPPO and QMIX in win rate, coordination score, and zero-shot generalization. Ablation studies demonstrate that subgoal generation and language-based messaging each contribute significantly to performance gains. Qualitative analysis reveals emergent behaviors such as role specialization and communication-driven tactics. By bridging language modeling and policy learning, this work contributes to the design of intelligent, cooperative agents in interactive simulations. It offers a path forward for leveraging LLMs in multi-agent systems used for training, games, and human-AI collaboration.