MVDrag3D: Drag-based Creative 3D Editing via Multi-view Generation-Reconstruction Priors

Abstract

Drag-based editing has become popular in 2D content creation, driven by the capabilities of image generative models. However, extending this technique to 3D remains a challenge. Existing 3D drag-based editing methods, whether employing explicit spatial transformations or relying on implicit latent optimization within limited-capacity 3D generative models, fall short in handling significant topology changes or generating new textures across diverse object categories. To overcome these limitations, we introduce MVDrag3D, a novel framework for more flexible and creative drag-based 3D editing that leverages multi-view generation and reconstruction priors. At the core of our approach is the usage of a multi-view diffusion model as a strong generative prior to perform consistent drag editing over multiple rendered views, which is followed by a reconstruction model that reconstructs 3D Gaussians of the edited object. While the initial 3D Gaussians may suffer from misalignment between different views, we address this via view-specific deformation networks that adjust the position of Gaussians to be well aligned. In addition, we propose a multi-view score function that distills generative priors from multiple views to further enhance the view consistency and visual quality. Extensive experiments demonstrate that MVDrag3D provides a precise, generative, and flexible solution for 3D drag-based editing, supporting more versatile editing effects across various object categories and 3D representations.
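
The stage ordering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `DragPipeline`, its field names, and every call signature are hypothetical, and each stage is an injected callable standing in for a real model. In particular, passing the edited views to the alignment stage is an assumption; the abstract states only that view-specific deformation networks adjust the Gaussian positions.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Sequence


@dataclass
class DragPipeline:
    """Hypothetical container for the stages named in the abstract.

    Every field is a placeholder callable; none of these names or
    signatures come from the paper or from any released code.
    """
    # Render the input 3D object into multiple views.
    render_views: Callable[[Any], List[Any]]
    # Multi-view diffusion prior: apply the drag edit consistently across views.
    drag_edit_views: Callable[[List[Any], Sequence[Any]], List[Any]]
    # Reconstruction model: edited views -> initial 3D Gaussians.
    reconstruct: Callable[[List[Any]], Any]
    # View-specific deformation: adjust Gaussian positions to align the views
    # (taking the edited views as a second argument is an assumption).
    align: Callable[[Any, List[Any]], Any]
    # Multi-view score function: distill generative priors to refine quality.
    refine: Callable[[Any], Any]

    def run(self, obj: Any, drags: Sequence[Any]) -> Any:
        """Run the stages in the order the abstract lists them."""
        views = self.render_views(obj)
        edited = self.drag_edit_views(views, drags)
        gaussians = self.reconstruct(edited)
        gaussians = self.align(gaussians, edited)
        return self.refine(gaussians)
```

Keeping each stage as an injected callable makes the data flow explicit (views, edited views, initial Gaussians, aligned Gaussians, refined Gaussians) without committing to any particular model or library.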
