Prompting Techniques for Secure Code Generation: A Systematic Investigation

Abstract

Large Language Models (LLMs) are gaining momentum in software development, with prompt-driven programming enabling developers to create code from natural language (NL) instructions. However, studies have questioned their ability to produce secure code and, thereby, the quality of prompt-generated software. In parallel, various prompting techniques that carefully tailor prompts have emerged to elicit optimal responses from LLMs. Still, the interplay between such prompting strategies and secure code generation remains under-explored and calls for further investigation. OBJECTIVE: In this study, we investigate the impact of different prompting techniques on the security of code generated from NL instructions by LLMs. METHOD: First, we perform a systematic literature review to identify the existing prompting techniques that can be used for code generation tasks. A subset of these techniques is then evaluated on GPT-3, GPT-3.5, and GPT-4 models for secure code generation. For this, we use an existing dataset consisting of 150 NL security-relevant code-generation prompts. RESULTS: Our work (i) classifies potential prompting techniques for code generation, (ii) adapts and evaluates a subset of the identified techniques for secure code generation tasks, and (iii) observes a reduction in security weaknesses across the tested LLMs, especially after using an existing technique called Recursive Criticism and Improvement (RCI), contributing valuable insights to the ongoing discourse on LLM-generated code security.
