Abstract
Dance is an important human art form, but creating new dances can be difficult and time-consuming. In this work, we introduce Editable Dance GEneration (EDGE), a state-of-the-art method for editable dance generation that is capable of creating realistic, physically plausible dances while remaining faithful to the input music. EDGE uses a transformer-based diffusion model paired with Jukebox, a strong music feature extractor, and confers powerful editing capabilities well-suited to dance, including joint-wise conditioning and in-betweening. We introduce a new metric for physical plausibility, and extensively evaluate the quality of dances generated by our method through (1) multiple quantitative metrics on physical plausibility, beat alignment, and diversity benchmarks, and more importantly, (2) a large-scale user study, demonstrating a significant improvement over previous state-of-the-art methods. Qualitative samples from our model can be found at our website.