Abstract
Panoptic segmentation is an important computer vision task that combines semantic and instance segmentation. It plays a crucial role in domains such as medical image analysis, self-driving vehicles, and robotics by providing a comprehensive understanding of visual environments. Traditionally, deep learning panoptic segmentation models have relied on dense and accurately annotated training data, which is expensive and time-consuming to obtain. Recent advances in self-supervised learning have shown great potential in leveraging synthetic and unlabelled data to generate pseudo-labels through self-training, improving the performance of instance and semantic segmentation models. The three available methods for self-supervised panoptic segmentation use proposal-based transformer architectures, which are computationally expensive, complicated, and engineered for specific tasks. The aim of this work is to develop a framework for embedding-based self-supervised panoptic segmentation using self-training in a synthetic-to-real domain adaptation setting.