SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

  • 2024-04-18 18:59:46
  • Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang

Abstract

Open-world entity segmentation, as an emerging computer vision task, aims at segmenting entities in images without being restricted by pre-defined classes, offering impressive generalization capabilities on unseen images and concepts. Despite its promise, existing entity segmentation methods like Segment Anything Model (SAM) rely heavily on costly expert annotators. This work presents Self-supervised Open-world Hierarchical Entity Segmentation (SOHES), a novel approach that eliminates the need for human annotations. SOHES operates in three phases: self-exploration, self-instruction, and self-correction. Given a pre-trained self-supervised representation, we produce abundant high-quality pseudo-labels through visual feature clustering. Then, we train a segmentation model on the pseudo-labels, and rectify the noises in pseudo-labels via a teacher-student mutual-learning procedure. Beyond segmenting entities, SOHES also captures their constituent parts, providing a hierarchical understanding of visual entities. Using raw images as the sole training data, our method achieves unprecedented performance in self-supervised open-world segmentation, marking a significant milestone towards high-quality open-world entity segmentation in the absence of human-annotated masks. Project page: https://SOHES.github.io.
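
The abstract does not specify how the feature clustering is carried out, so the following is only a minimal sketch of one way that pseudo-labels could come from clustering patch features. It assumes greedy agglomerative merging by cosine similarity, which is a stand-in and not necessarily the procedure used in SOHES. The function name merge_patches, the threshold values, and the toy 2-D features are all hypothetical; features are assumed to be non-zero vectors.

```python
import math

def _unit(v):
    # Scale a (non-zero) vector to unit length.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def _mean_direction(vectors):
    # Unit-length mean of a group of vectors.
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return _unit(mean)

def merge_patches(features, threshold):
    """Hypothetical helper: repeatedly merge the two most similar groups
    of patches until no pair has cosine similarity >= threshold.
    Returns one integer group label per patch."""
    feats = [_unit(f) for f in features]
    groups = [[i] for i in range(len(feats))]  # start with one group per patch
    while len(groups) > 1:
        centers = [_mean_direction([feats[i] for i in g]) for g in groups]
        best, best_pair = -2.0, None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                sim = sum(x * y for x, y in zip(centers[a], centers[b]))
                if sim > best:
                    best, best_pair = sim, (a, b)
        if best < threshold:
            break  # remaining groups are too dissimilar to merge
        a, b = best_pair
        groups[a] = groups[a] + groups[b]
        del groups[b]
    labels = [0] * len(feats)
    for k, g in enumerate(groups):
        for i in g:
            labels[i] = k
    return labels

# Toy features (assumed): two patches pointing one way, two pointing another.
toy = [[1.0, 0.0], [0.99, 0.1], [0.0, 1.0], [0.1, 0.99]]
parts = merge_patches(toy, 0.9)    # stricter threshold: finer groups
wholes = merge_patches(toy, 0.05)  # looser threshold: coarser groups
```

In this sketch, running the same merging at a stricter and a looser threshold gives nested groupings, which is one simple way to picture the part-and-whole hierarchy the abstract mentions. In practice the features would come from a pre-trained self-supervised backbone rather than hand-written vectors.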

 
