A Survey on Diffusion Language Models

Abstract

Diffusion Language Models (DLMs) are rapidly emerging as a powerful and promising alternative to the dominant autoregressive (AR) paradigm. By generating tokens in parallel through an iterative denoising process, DLMs possess inherent advantages in reducing inference latency and capturing bidirectional context, thereby enabling fine-grained control over the generation process. While achieving a several-fold speed-up, recent advancements have allowed DLMs to show performance comparable to their autoregressive counterparts, making them a compelling choice for various natural language processing tasks. In this survey, we provide a holistic overview of the current DLM landscape. We trace its evolution and relationship with other paradigms, such as autoregressive and masked language models, and cover both foundational principles and state-of-the-art models. Our work offers an up-to-date, comprehensive taxonomy and an in-depth analysis of current techniques, from pre-training strategies to advanced post-training methods. Another contribution of this survey is a thorough review of DLM inference strategies and optimizations, including improvements in decoding parallelism, caching mechanisms, and generation quality. We also highlight the latest approaches to multimodal extensions of DLMs and delineate their applications across various practical scenarios. Furthermore, our discussion addresses the limitations and challenges of DLMs, including efficiency, long-sequence handling, and infrastructure requirements, while outlining future research directions to sustain progress in this rapidly evolving field. Project GitHub is available at https://github.com/VILA-Lab/Awesome-DLMs.
