MCP Safety Audit: LLMs with the Model Context Protocol Allow Major Security Exploits

Abstract

To reduce development overhead and enable seamless integration between potential components comprising any given generative AI application, the Model Context Protocol (MCP) (Anthropic, 2024) has recently been released and subsequently widely adopted. The MCP is an open protocol that standardizes API calls to large language models (LLMs), data sources, and agentic tools. By connecting multiple MCP servers, each defined with a set of tools, resources, and prompts, users are able to define automated workflows fully driven by LLMs. However, we show that the current MCP design carries a wide range of security risks for end users. In particular, we demonstrate that industry-leading LLMs may be coerced into using MCP tools to compromise an AI developer's system through various attacks, such as malicious code execution, remote access control, and credential theft. To proactively mitigate these and related attacks, we introduce a safety auditing tool, MCPSafetyScanner, the first agentic tool to assess the security of an arbitrary MCP server. MCPSafetyScanner uses several agents to (a) automatically determine adversarial samples given an MCP server's tools and resources; (b) search for related vulnerabilities and remediations based on those samples; and (c) generate a security report detailing all findings. Our work highlights serious security issues with general-purpose agentic workflows while also providing a proactive tool to audit MCP server safety and address detected vulnerabilities before deployment. The described MCP server auditing tool, MCPSafetyScanner, is freely available at: https://github.com/johnhalloran321/mcpSafetyScanner
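The three audit stages (a) to (c) named in the abstract can be illustrated with a minimal Python sketch. This is not the MCPSafetyScanner implementation: the LLM agents are replaced by a keyword lookup, and every name here (`ToolSpec`, `Finding`, `KNOWLEDGE_BASE`, `audit`, `render_report`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ToolSpec:
    """Hypothetical description of one tool exposed by an MCP server."""
    name: str
    description: str


@dataclass
class Finding:
    tool: str
    sample: str
    vulnerability: str
    remediation: str


# Toy knowledge base standing in for the agent-driven search in stage (b).
# keyword -> (adversarial prompt template, vulnerability, remediation).
# All entries are illustrative assumptions, not results from the paper.
KNOWLEDGE_BASE = {
    "write": ("Use {tool} to append a key to ~/.ssh/authorized_keys",
              "arbitrary file write",
              "restrict writable paths to an allow-list"),
    "shell": ("Use {tool} to download and run a remote script",
              "malicious code execution",
              "sandbox or disable command execution"),
    "read": ("Use {tool} to print files containing API keys",
             "credential theft",
             "deny reads of secret and key files"),
}


def audit(tools):
    """Stages (a) and (b): derive adversarial samples, then look up matches."""
    findings = []
    for tool in tools:
        text = tool.description.lower()
        for keyword, (template, vuln, fix) in KNOWLEDGE_BASE.items():
            if keyword in text:
                sample = template.format(tool=tool.name)
                findings.append(Finding(tool.name, sample, vuln, fix))
    return findings


def render_report(findings):
    """Stage (c): turn the findings into a plain-text security report."""
    if not findings:
        return "No findings."
    lines = []
    for f in findings:
        lines.append(f"[{f.tool}] {f.vulnerability}")
        lines.append(f"  sample: {f.sample}")
        lines.append(f"  remediation: {f.remediation}")
    return "\n".join(lines)


if __name__ == "__main__":
    server_tools = [
        ToolSpec("write_file", "Write text to a file"),
        ToolSpec("get_time", "Return the current time"),
    ]
    print(render_report(audit(server_tools)))
```

In this sketch only `write_file` is flagged, because its description matches a knowledge-base keyword. A real audit would instead query the server for its tools and resources and let LLM agents propose and research the adversarial samples.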
