Abstract
Open-vocabulary querying in 3D Gaussian Splatting aims to identify semantically relevant regions within a 3D Gaussian representation based on a given text query. Prior work, such as LangSplat, addressed this task by retrieving these regions in the form of segmentation masks on 2D renderings. More recently, OpenGaussian introduced point-level querying, which directly selects a subset of 3D Gaussians. In this work, we propose a point-level querying method that builds upon LangSplat's framework. Our approach improves the framework in two key ways: (a) we leverage masklets from the Segment Anything Model 2 (SAM2) to establish semantically consistent ground-truth for distilling the language Gaussians; (b) we introduce a novel two-step querying approach that first retrieves the distilled ground-truth and subsequently uses the ground-truth to query the individual Gaussians. Experimental evaluations on three benchmark datasets demonstrate that the proposed method outperforms state-of-the-art approaches. For instance, our method achieves an mIoU improvement of +20.42 on the 3D-OVS dataset.
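To make the two-step querying idea in (b) concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the text query, the distilled ground-truth features, and the per-Gaussian language features all live in one shared embedding space and are compared by cosine similarity. The function names `cosine` and `two_step_query`, the `threshold` parameter, and the toy vectors are hypothetical and chosen only for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def two_step_query(text_emb, gt_feats, gaussian_feats, threshold=0.9):
    """Hypothetical two-step query.

    Step 1: pick the ground-truth feature most similar to the text embedding.
    Step 2: select the Gaussians whose feature is similar to that ground-truth
            feature (assumed criterion: cosine similarity >= threshold).
    Returns (index of retrieved ground-truth feature, indices of selected Gaussians).
    """
    # Step 1: text -> distilled ground-truth feature
    best = max(range(len(gt_feats)), key=lambda i: cosine(text_emb, gt_feats[i]))
    target = gt_feats[best]

    # Step 2: ground-truth feature -> individual Gaussians
    selected = [
        i for i, feat in enumerate(gaussian_feats)
        if cosine(feat, target) >= threshold
    ]
    return best, selected


# Toy data (assumed, 3-dimensional for readability).
text_emb = [0.9, 0.1, 0.0]
gt_feats = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
gaussian_feats = [
    [0.98, 0.05, 0.0],
    [0.10, 0.95, 0.0],
    [0.90, 0.10, 0.1],
    [0.00, 0.00, 1.0],
]

best_gt, selected_gaussians = two_step_query(text_emb, gt_feats, gaussian_feats)
print(best_gt, selected_gaussians)
```

In this sketch the text embedding is never compared with the Gaussians directly: it only selects a ground-truth feature, and that feature then drives the point-level selection. How the actual method retrieves the ground-truth and thresholds the Gaussians is not stated in the abstract.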