Abstract
Amodal Instance Segmentation (AIS) is a challenging task because it involves predicting both the visible and occluded parts of objects within images. Existing AIS methods rely on a bidirectional approach, encompassing both the transition from amodal features to visible features (amodal-to-visible) and the transition from visible features to amodal features (visible-to-amodal). We observe that using amodal features through the amodal-to-visible transition can confuse the visible features, because the amodal features carry extra information about occluded/hidden segments that is not present in the visible region. Consequently, the quality of the visible features is compromised during the subsequent visible-to-amodal transition. To tackle this issue, we introduce ShapeFormer, a decoupled Transformer-based model with a visible-to-amodal transition. It establishes an explicit relationship between output segmentations and avoids the need for amodal-to-visible transitions. ShapeFormer comprises three key modules: (i) a Visible-Occluding Mask Head for predicting visible segmentation with occlusion awareness, (ii) a Shape-Prior Amodal Mask Head for predicting amodal and occluded masks, and (iii) a Category-Specific Shape Prior Retriever for providing shape prior knowledge. Comprehensive experiments and extensive ablation studies across various AIS benchmarks demonstrate the effectiveness of ShapeFormer. The code is available at: \url{https://github.com/UARK-AICV/ShapeFormer}
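
To make the three output types concrete, the following is a minimal sketch of how visible, occluded, and amodal masks relate for a single instance: the amodal mask covers the whole object, and the occluded mask is the part of it that is not visible. This is not the ShapeFormer model or its code; the helper names `occluded_from` and `union`, the nested-list mask format, and the toy values are assumptions made purely for illustration.

```python
# Illustrative only: binary masks as lists of rows of 0/1 integers.
# Helper names and toy data are hypothetical, not taken from ShapeFormer.

def occluded_from(amodal, visible):
    """Occluded region: pixels inside the amodal mask that are not visible."""
    return [
        [a & (1 - v) for a, v in zip(a_row, v_row)]
        for a_row, v_row in zip(amodal, visible)
    ]


def union(mask_a, mask_b):
    """Pixel-wise union of two binary masks of the same shape."""
    return [
        [a | b for a, b in zip(a_row, b_row)]
        for a_row, b_row in zip(mask_a, mask_b)
    ]


# Toy 2x4 instance: the right-hand part of the object is hidden by an occluder.
visible = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
]
amodal = [
    [0, 1, 1, 1],
    [0, 1, 1, 1],
]

occluded = occluded_from(amodal, visible)

# The visible and occluded parts together recover the full amodal mask.
reconstructed = union(visible, occluded)
```

In this toy case the amodal mask is exactly the union of the visible and occluded masks, which is the kind of consistency between output segmentations that a visible-to-amodal design can rely on.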