Abstract
Modern 3D generation methods can rapidly create shapes from sparse or single views, but their outputs often lack geometric detail due to computational constraints. We present DetailGen3D, a generative approach specifically designed to enhance these generated 3D shapes. Our key insight is to model the coarse-to-fine transformation directly through data-dependent flows in latent space, avoiding the computational overhead of large-scale 3D generative models. We introduce a token matching strategy that ensures accurate spatial correspondence during refinement, enabling local detail synthesis while preserving global structure. By carefully designing our training data to match the characteristics of synthesized coarse shapes, our method can effectively enhance shapes produced by various 3D generation and reconstruction approaches, from single-view to sparse multi-view inputs. Extensive experiments demonstrate that DetailGen3D achieves high-fidelity geometric detail synthesis while maintaining efficiency in training.