Security Concerns for Large Language Models: A Survey

Abstract

Large Language Models (LLMs) such as ChatGPT and its competitors have caused a revolution in natural language processing, but their capabilities also introduce new security vulnerabilities. This survey provides a comprehensive overview of these emerging concerns, categorizing threats into several key areas: prompt injection and jailbreaking; adversarial attacks, including input perturbations and data poisoning; misuse by malicious actors to generate disinformation, phishing emails, and malware; and the worrisome risks inherent in autonomous LLM agents. The last of these has recently drawn growing attention, and we examine goal misalignment, emergent deception, self-preservation instincts, and the potential for LLMs to develop and pursue covert, misaligned objectives, a behavior known as scheming, which may even persist through safety training. We summarize recent academic and industrial studies from 2022 to 2025 that exemplify each threat, analyze proposed defenses and their limitations, and identify open challenges in securing LLM-based applications. We conclude by emphasizing the importance of advancing robust, multi-layered security strategies to ensure LLMs are safe and beneficial.
