On the Resilience of Multi-Agent Systems with Malicious Agents

Abstract

Multi-agent systems, powered by large language models, have shown great abilities across various tasks due to the collaboration of expert agents, each focusing on a specific domain. However, when agents are deployed separately, there is a risk that malicious users may introduce malicious agents that generate incorrect or irrelevant results that are too stealthy to be identified by other non-specialized agents. Therefore, this paper investigates two essential questions: (1) What is the resilience of various multi-agent system structures (e.g., A$\rightarrow$B$\rightarrow$C, A$\leftrightarrow$B$\leftrightarrow$C) under malicious agents, on different downstream tasks? (2) How can we increase system resilience to defend against malicious agents? To simulate malicious agents, we devise two methods, AutoTransform and AutoInject, to transform any agent into a malicious one while preserving its functional integrity. We run comprehensive experiments on four downstream multi-agent systems tasks, namely code generation, math problems, translation, and text evaluation. Results suggest that the "hierarchical" multi-agent structure, i.e., A$\rightarrow$(B$\leftrightarrow$C), exhibits superior resilience with the lowest performance drop of $23.6\%$, compared to $46.4\%$ and $49.8\%$ for the other two structures. Additionally, we show the promise of improving multi-agent system resilience by demonstrating that two defense methods, introducing a mechanism for each agent to challenge others' outputs, or an additional agent to review and correct messages, can enhance system resilience. Our code and data are available at https://github.com/CUHK-ARISE/MAS-Resilience.
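
To make the three structure notations concrete, the following is a minimal sketch that encodes them as directed message graphs and computes which agents could receive output originating from a given agent. It is an illustration only, not the paper's implementation: the names `STRUCTURES` and `downstream`, the labels "linear" and "flat", and the reading of A$\rightarrow$(B$\leftrightarrow$C) as "A sends to both B and C, which exchange messages with each other" are all assumptions.

```python
from collections import deque

# Hypothetical encodings of the structures named in the abstract.
# An edge (X, Y) is assumed to mean "agent X sends messages to agent Y".
STRUCTURES = {
    # A -> B -> C
    "linear": {("A", "B"), ("B", "C")},
    # A <-> B <-> C
    "flat": {("A", "B"), ("B", "A"), ("B", "C"), ("C", "B")},
    # A -> (B <-> C), assuming A addresses both B and C
    "hierarchical": {("A", "B"), ("A", "C"), ("B", "C"), ("C", "B")},
}


def downstream(edges, source):
    """Return every agent reachable from `source` by following message edges."""
    adjacency = {}
    for sender, receiver in edges:
        adjacency.setdefault(sender, set()).add(receiver)

    reached = set()
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, ()):
            # Skip the source itself and agents already visited.
            if nxt != source and nxt not in reached:
                reached.add(nxt)
                queue.append(nxt)
    return reached


if __name__ == "__main__":
    for name, edges in STRUCTURES.items():
        for agent in "ABC":
            print(name, agent, sorted(downstream(edges, agent)))
```

Under these assumptions, a faulty message from the last agent in the linear chain reaches no other agent, while one from C in the flat structure can propagate back to both A and B. Reachability alone says nothing about the measured performance drops, which depend on the tasks and agents evaluated in the paper.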
