Continuous diffusion for categorical data

  • 2022-11-28 06:08:54
  • Sander Dieleman, Laurent Sartran, Arman Roshannai, Nikolay Savinov, Yaroslav Ganin, Pierre H. Richemond, Arnaud Doucet, Robin Strudel, Chris Dyer, Conor Durkan, Curtis Hawthorne, Rémi Leblond, Will Grathwohl, Jonas Adler
  • 56

Abstract

Diffusion models have quickly become the go-to paradigm for generative modelling of perceptual signals (such as images and sound) through iterative refinement. Their success hinges on the fact that the underlying physical phenomena are continuous. For inherently discrete and categorical data such as language, various diffusion-inspired alternatives have been proposed. However, the continuous nature of diffusion models conveys many benefits, and in this work we endeavour to preserve it. We propose CDCD, a framework for modelling categorical data with diffusion models that are continuous both in time and input space. We demonstrate its efficacy on several language modelling tasks.

 
