A Survey on Data Security in Large Language Models

Abstract

Large Language Models (LLMs), now a foundation in advancing natural language processing, power applications such as text generation, machine translation, and conversational systems. Despite their transformative potential, these models inherently rely on massive amounts of training data, often collected from diverse and uncurated sources, which exposes them to serious data security risks. Harmful or malicious data can compromise model behavior, leading to issues such as toxic output, hallucinations, and vulnerabilities to threats such as prompt injection or data poisoning. As LLMs continue to be integrated into critical real-world systems, understanding and addressing these data-centric security risks is imperative to safeguard user trust and system reliability. This survey offers a comprehensive overview of the main data security risks facing LLMs and reviews current defense strategies, including adversarial training, RLHF, and data augmentation. Additionally, we categorize and analyze relevant datasets used for assessing robustness and security across different domains, providing guidance for future research. Finally, we highlight key research directions that focus on secure model updates, explainability-driven defenses, and effective governance frameworks, aiming to promote the safe and responsible development of LLM technology. This work aims to inform researchers, practitioners, and policymakers, driving progress toward data security in LLMs.
