Abstract
We introduce a new image segmentation task, termed Entity Segmentation (ES), which aims to segment all visual entities in an image without considering semantic category labels. It has many practical applications in image manipulation and editing, where segmentation mask quality is typically crucial but category labels are less important. In this setting, all semantically meaningful segments are treated equally as categoryless entities, and there is no thing-stuff distinction. Based on our unified entity representation, we propose a center-based entity segmentation framework with two novel modules to improve mask quality. Experimentally, both our new task and framework show clear advantages over existing work. In particular, ES enables the following: (1) multiple datasets can be merged to form a large training set without the need to resolve label conflicts; (2) a model trained on one dataset can generalize exceptionally well to other datasets from unseen domains. Our code is made publicly available at https://github.com/dvlab-research/Entity.