Abstract
Semantic parsing is a technique aimed at constructing a structured representation of the meaning of a natural-language question. Recent advancements in few-shot language models trained on code have demonstrated superior performance in generating these representations compared to traditional unimodal language models, which are trained on downstream tasks. Despite these advancements, existing fine-tuned neural semantic parsers are susceptible to adversarial attacks on natural-language inputs. While it has been established that the robustness of smaller semantic parsers can be enhanced through adversarial training, this approach is not feasible for large language models in real-world scenarios, as it requires both substantial computational resources and expensive human annotation on in-domain semantic parsing data. This paper presents the first empirical study on the adversarial robustness of a large prompt-based language model of code, Codex. Our results demonstrate that the state-of-the-art (SOTA) code-language models are vulnerable to carefully crafted adversarial examples. To address this challenge, we propose methods for improving robustness without the need for significant amounts of labeled data or heavy computational resources.