Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks

Abstract

We present and evaluate a method called grammar masking, which is used to guide large language models (LLMs) toward producing syntactically correct models for a given context-free grammar. Prompt engineering methods such as few-shot learning or priming can be used to improve the chances of an LLM producing correct syntax, but the more complex the grammar, the more time-consuming and less promising these methods become. Previous work is focused primarily on the usage of either language model training or prompt engineering. In this work, a method is presented that restricts the output to a given grammar using constrained decoding to ensure the output adheres to a valid syntax. We use several DSLs built with MontiCore and task multiple LLMs to produce models with and without constrained decoding. A corresponding parser is used to confirm the syntactic correctness of each model. We show that grammar masking can dramatically improve the modeling capabilities of several LLMs, reducing the need for well-refined prompting while increasing the chance of producing correct models.
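The general idea behind constrained decoding can be illustrated with a minimal sketch. It is not the implementation evaluated in this work: the grammar is a toy one (balanced parentheses) rather than a MontiCore DSL, the "model" is a hypothetical stand-in that returns fixed scores, and every name (`allowed_tokens`, `mask_logits`, `decode`, `fake_logits`) is an assumption made for illustration. At each step, tokens that would make the prefix impossible to complete into a valid string have their scores set to negative infinity before the next token is chosen.

```python
import math

VOCAB = ["(", ")", "<eos>"]  # toy vocabulary, assumed for this sketch


def allowed_tokens(prefix, max_len):
    """Tokens that keep `prefix` extendable to a non-empty balanced string."""
    depth = prefix.count("(") - prefix.count(")")
    remaining = max_len - len(prefix)
    allowed = set()
    if depth > 0:
        allowed.add(")")
    if depth == 0 and prefix:
        allowed.add("<eos>")
    # Open a new bracket only if enough room is left to close all of them.
    if remaining - 1 >= depth + 1:
        allowed.add("(")
    return allowed


def mask_logits(logits, allowed):
    """Set the score of every disallowed token to -inf."""
    return [x if tok in allowed else -math.inf for tok, x in zip(VOCAB, logits)]


def fake_logits(prefix):
    """Hypothetical stand-in for an LLM: always prefers ')' regardless of context."""
    return [1.0, 2.0, 0.5]


def decode(logits_fn, max_len=6, constrained=True):
    """Greedy decoding, optionally with the grammar mask applied (max_len >= 2)."""
    prefix = []
    while True:
        logits = logits_fn(prefix)
        if constrained:
            logits = mask_logits(logits, allowed_tokens(prefix, max_len))
        token = VOCAB[max(range(len(VOCAB)), key=logits.__getitem__)]
        if token == "<eos>" or len(prefix) >= max_len:
            break
        prefix.append(token)
    return "".join(prefix)


def is_balanced(s):
    """Reference check playing the role of the parser."""
    depth = 0
    for ch in s:
        depth += 1 if ch == "(" else -1
        if depth < 0:
            return False
    return depth == 0
```

With the mask enabled, `decode(fake_logits)` yields a balanced string even though the stand-in model always prefers a closing bracket; with `constrained=False` the same model emits only closing brackets, which `is_balanced` rejects. A real setup would derive the set of allowed tokens from a parser state for the target grammar instead of a hand-written depth counter.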
