Abstract
Multi-agent systems (MAS) need to cope adaptively with dynamic environments, changing agent populations, and diverse tasks. However, most multi-agent systems cannot easily handle these challenges, due to the complexity of the state and task space. Social impact theory regards the complex influencing factors as forces acting on an agent, emanating from the environment, other agents, and the agent's intrinsic motivation, and refers to them as social forces. Inspired by this concept, we propose a novel gradient-based state representation for multi-agent reinforcement learning. To non-trivially model the social forces, we further introduce a data-driven method, where we employ denoising score matching to learn the social gradient fields (SocialGFs) from offline samples, e.g., the attractive or repulsive outcomes of each force. During interactions, the agents take actions based on the multi-dimensional gradients to maximize their own rewards. In practice, we integrate SocialGFs into widely used multi-agent reinforcement learning algorithms, e.g., MAPPO. The empirical results reveal that SocialGFs offer four advantages for multi-agent systems: 1) they can be learned without requiring online interaction, 2) they demonstrate transferability across diverse tasks, 3) they facilitate credit assignment in challenging reward settings, and 4) they scale with an increasing number of agents.
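
To make the idea of learning a gradient field from offline samples concrete, the following is a minimal sketch of denoising score matching in one dimension. It is a toy illustration under stated assumptions, not the model used in this work: the linear score model, the Gaussian "goal" samples, the noise level `sigma`, and the names `fit_linear_score` and `score` are all hypothetical choices made for this example. Each sample is perturbed with Gaussian noise and the model is regressed onto the direction that points from the noisy point back to the clean one, so the fitted field is attractive toward the region where the samples concentrate.

```python
import random


def fit_linear_score(samples, sigma, seed=0):
    """Fit s(x) = w * x + b by denoising score matching (toy 1D case).

    Assumption: a linear score model, which is only adequate for
    roughly Gaussian data; it stands in for a learned network.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for x in samples:
        eps = rng.gauss(0.0, 1.0)
        x_noisy = x + sigma * eps
        xs.append(x_noisy)
        # Regression target (x - x_noisy) / sigma**2, which equals -eps / sigma.
        ys.append(-eps / sigma)

    # Closed-form least squares for a single input feature.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((a - mean_x) * (c - mean_y) for a, c in zip(xs, ys)) / n
    var_x = sum((a - mean_x) ** 2 for a in xs) / n
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Hypothetical offline samples: positions clustered around a goal at x = 2.
data_rng = random.Random(1)
goal_samples = [data_rng.gauss(2.0, 0.5) for _ in range(20000)]

w, b = fit_linear_score(goal_samples, sigma=0.5)


def score(x):
    """Learned 1D gradient field: positive pushes right, negative pushes left."""
    return w * x + b
```

In this toy setting, `score(x)` is positive to the left of the sample cluster and negative to the right, so an agent that follows the sign of the field moves toward the cluster. A repulsive field could be sketched by negating the output.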