Abstract
The rapid progress of graph generation has raised new security concerns, particularly regarding backdoor vulnerabilities. While prior work has explored backdoor attacks in image diffusion and unconditional graph generation, conditional graph generation, especially text-guided generation, remains largely unexamined. This paper proposes BadGraph, a backdoor attack method targeting latent diffusion models for text-guided graph generation. BadGraph uses textual triggers to poison training data, covertly implanting backdoors that induce attacker-specified subgraphs at inference time when the triggers appear, while preserving normal performance on clean inputs. Extensive experiments on four benchmark datasets (PubChem, ChEBI-20, PCDes, MoMu) demonstrate the effectiveness and stealth of the attack: a poisoning rate below 10% can achieve a 50% attack success rate, while 24% suffices for a success rate above 80%, with negligible performance degradation on benign samples. Ablation studies further reveal that the backdoor is implanted during VAE and diffusion training rather than pretraining. These findings expose security vulnerabilities in latent diffusion models for text-guided graph generation, highlight the serious risks in applications such as drug discovery, and underscore the need for robust defenses against backdoor attacks in such diffusion models.