StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians

  • 2025-04-21 18:59:55
  • Cailin Zhuang, Yaoqi Hu, Xuanyang Zhang, Wei Cheng, Jiacheng Bao, Shengqi Liu, Yiying Yang, Xianfang Zeng, Gang Yu, Ming Li

Abstract

3D Gaussian Splatting (3DGS) excels in photorealistic scene reconstruction but struggles with stylized scenarios (e.g., cartoons, games) due to fragmented textures, semantic misalignment, and limited adaptability to abstract aesthetics. We propose StyleMe3D, a holistic framework for 3D GS style transfer that integrates multi-modal style conditioning, multi-level semantic alignment, and perceptual quality enhancement. Our key insights include: (1) optimizing only RGB attributes preserves geometric integrity during stylization; (2) disentangling low-, medium-, and high-level semantics is critical for coherent style transfer; (3) scalability across isolated objects and complex scenes is essential for practical deployment. StyleMe3D introduces four novel components: Dynamic Style Score Distillation (DSSD), leveraging Stable Diffusion's latent space for semantic alignment; Contrastive Style Descriptor (CSD) for localized, content-aware texture transfer; Simultaneously Optimized Scale (SOS) to decouple style details and structural coherence; and 3D Gaussian Quality Assessment (3DG-QA), a differentiable aesthetic prior trained on human-rated data to suppress artifacts and enhance visual harmony. Evaluated on NeRF synthetic dataset (objects) and tandt db (scenes) datasets, StyleMe3D outperforms state-of-the-art methods in preserving geometric details (e.g., carvings on sculptures) and ensuring stylistic consistency across scenes (e.g., coherent lighting in landscapes), while maintaining real-time rendering. This work bridges photorealistic 3D GS and artistic stylization, unlocking applications in gaming, virtual worlds, and digital art.

 
