Abstract
LLMs have shown impressive few-shot performance across many tasks. However, they still struggle to reliably generate complex output structures, such as those required for information extraction. This limitation stems from the fact that LLMs, without fine-tuning, tend to generate free text rather than structures that precisely follow a specific grammar. In this work, we propose to enrich the decoding with formal grammar constraints. More concretely, given a context-free grammar (CFG), our framework ensures that the token generated in each decoding step leads to a valid continuation compliant with the grammar production rules. This process guarantees the generation of valid sequences. Importantly, our framework can be readily combined with any CFG or decoding algorithm. We demonstrate that the outputs of many NLP tasks can be represented as formal languages, making them suitable for direct use in our framework. We conducted experiments on two challenging tasks involving large alphabets in their grammar (Wikidata entities and relations): information extraction and entity disambiguation. Our results with LLaMA models indicate that grammar-constrained decoding substantially outperforms unconstrained decoding and even competes with task-specific fine-tuned models. These findings suggest that integrating grammar-based constraints during decoding holds great promise in making LLMs reliably produce structured outputs, especially in settings where training data is scarce and fine-tuning is expensive.
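To make the decoding constraint concrete, the following is a minimal sketch of grammar-constrained greedy decoding, not the implementation used in this work. It assumes a toy CFG for subject-relation-object triples, whole-word tokens rather than subword tokens, a grammar without left recursion, and a hypothetical `toy_scores` function standing in for the LLM's next-token scores. All names (`GRAMMAR`, `expand`, `advance`, `constrained_greedy_decode`, `toy_scores`) are illustrative.

```python
from typing import Callable, Dict, List, Set, Tuple

Stack = Tuple[str, ...]
Grammar = Dict[str, List[Stack]]

# Toy CFG (an assumption for illustration): zero or more triples, then <eos>.
GRAMMAR: Grammar = {
    "S": [("TRIPLE", "S"), ("<eos>",)],
    "TRIPLE": [("[s]", "ENT", "[r]", "REL", "[o]", "ENT")],
    "ENT": [("Paris",), ("France",)],
    "REL": [("capital_of",), ("located_in",)],
}


def expand(stacks: Set[Stack], grammar: Grammar) -> Set[Stack]:
    """Rewrite leading nonterminals until each stack starts with a terminal
    or is empty. Would not terminate on a left-recursive grammar."""
    done: Set[Stack] = set()
    todo = list(stacks)
    while todo:
        stack = todo.pop()
        if stack and stack[0] in grammar:
            for rhs in grammar[stack[0]]:
                todo.append(rhs + stack[1:])
        else:
            done.add(stack)
    return done


def advance(stacks: Set[Stack], token: str, grammar: Grammar) -> Set[Stack]:
    """Consume `token` from every parse stack that expects it next."""
    return expand({s[1:] for s in stacks if s and s[0] == token}, grammar)


def constrained_greedy_decode(
    score_fn: Callable[[List[str]], Dict[str, float]],
    grammar: Grammar,
    start: str = "S",
    max_steps: int = 50,
) -> List[str]:
    stacks = expand({(start,)}, grammar)
    output: List[str] = []
    for _ in range(max_steps):
        # Tokens that keep the output a valid prefix of the language.
        allowed = sorted({s[0] for s in stacks if s})
        if not allowed:
            break  # the derivation is complete
        scores = score_fn(output)
        # Greedy choice restricted to grammar-compliant tokens.
        token = max(allowed, key=lambda t: scores.get(t, float("-inf")))
        output.append(token)
        stacks = advance(stacks, token, grammar)
    return output


def toy_scores(prefix: List[str]) -> Dict[str, float]:
    """Hypothetical stand-in for an LLM: prefers the free-text token 'The'
    and penalizes tokens that were already generated."""
    base = {
        "The": 5.0, "Paris": 3.0, "capital_of": 2.5, "France": 2.0,
        "[s]": 1.0, "[r]": 1.0, "[o]": 1.0, "<eos>": 0.5, "located_in": 0.1,
    }
    return {tok: s - 2.0 * prefix.count(tok) for tok, s in base.items()}


result = constrained_greedy_decode(toy_scores, GRAMMAR)
print(result)
# ['[s]', 'Paris', '[r]', 'capital_of', '[o]', 'France', '<eos>']
```

Although the toy scorer ranks the free-text token `The` highest at the first step, the grammar never allows it, so the decoded sequence is always a valid derivation. The same masking step could in principle be placed inside beam search or sampling instead of the greedy choice. A real system would also need to map grammar terminals onto the model's subword vocabulary, which this sketch leaves out.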