Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction

  • 2024-03-27 18:40:14
  • Qiuhong Shen, Xuanyu Yi, Zike Wu, Pan Zhou, Hanwang Zhang, Shuicheng Yan, Xinchao Wang

Abstract

We tackle the challenge of efficiently reconstructing a 3D asset from a single image with growing demands for automated 3D content creation pipelines. Previous methods primarily rely on Score Distillation Sampling (SDS) and Neural Radiance Fields (NeRF). Despite their significant success, these approaches encounter practical limitations due to lengthy optimization and considerable memory usage. In this report, we introduce Gamba, an end-to-end amortized 3D reconstruction model from single-view images, emphasizing two main insights: (1) 3D representation: leveraging a large number of 3D Gaussians for an efficient 3D Gaussian splatting process; (2) Backbone design: introducing a Mamba-based sequential network that facilitates context-dependent reasoning and linear scalability with the sequence (token) length, accommodating a substantial number of Gaussians. Gamba incorporates significant advancements in data preprocessing, regularization design, and training methodologies. We assessed Gamba against existing optimization-based and feed-forward 3D generation approaches using the real-world scanned OmniObject3D dataset. Here, Gamba demonstrates competitive generation capabilities, both qualitatively and quantitatively, while achieving remarkable speed, approximately 0.6 second on a single NVIDIA A100 GPU.

 
