OW-DETR: Open-world Detection Transformer

  • 2021-12-02 18:58:30
  • Akshita Gupta, Sanath Narayan, K J Joseph, Salman Khan, Fahad Shahbaz Khan, Mubarak Shah
  


Open-world object detection (OWOD) is a challenging computer vision problem, where the task is to detect a known set of object categories while simultaneously identifying unknown objects. Additionally, the model must incrementally learn new classes that become known in subsequent training episodes. Distinct from standard object detection, the OWOD setting poses significant challenges for generating quality candidate proposals on potentially unknown objects, separating the unknown objects from the background, and detecting diverse unknown objects. Here, we introduce a novel end-to-end transformer-based framework, OW-DETR, for open-world object detection. The proposed OW-DETR comprises three dedicated components, namely attention-driven pseudo-labeling, novelty classification, and objectness scoring, to explicitly address the aforementioned OWOD challenges. Our OW-DETR explicitly encodes multi-scale contextual information, possesses less inductive bias, enables knowledge transfer from known classes to the unknown class, and can better discriminate between unknown objects and background. Comprehensive experiments are performed on two benchmarks: MS-COCO and PASCAL VOC. The extensive ablations reveal the merits of our proposed contributions. Further, our model outperforms the recently introduced OWOD approach, ORE, with absolute gains ranging from 1.8% to 3.3% in terms of unknown recall on the MS-COCO benchmark. In the case of incremental object detection, OW-DETR outperforms the state-of-the-art for all settings on the PASCAL VOC benchmark. Our code and models will be publicly released.
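The abstract reports gains in unknown recall without defining the metric. As an illustration only, the sketch below assumes unknown recall is the fraction of ground-truth unknown boxes that are matched by a box predicted as unknown. The function names, the 0.5 IoU threshold, and the greedy one-to-one matching are assumptions made for this example; this is not the authors' evaluation code.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates (assumed convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unknown_recall(gt_unknown: List[Box],
                   pred_unknown: List[Box],
                   iou_thresh: float = 0.5) -> float:
    """Fraction of ground-truth unknown boxes matched by a predicted unknown box.

    Hypothetical helper: each prediction may match at most one ground-truth
    box, and a match requires IoU >= iou_thresh (assumed to be 0.5).
    """
    if not gt_unknown:
        return 0.0
    used = set()  # indices of predictions already matched
    hits = 0
    for g in gt_unknown:
        best_j, best_iou = -1, iou_thresh
        for j, p in enumerate(pred_unknown):
            if j in used:
                continue
            v = iou(g, p)
            if v >= best_iou:
                best_j, best_iou = j, v
        if best_j >= 0:
            used.add(best_j)
            hits += 1
    return hits / len(gt_unknown)


if __name__ == "__main__":
    gt = [(0, 0, 10, 10), (20, 20, 30, 30)]
    pred = [(1, 1, 10, 10), (50, 50, 60, 60)]
    print(unknown_recall(gt, pred))
```

Under this definition, an absolute gain of a few percentage points means that correspondingly more of the annotated unknown objects are recovered as "unknown" rather than being missed or absorbed into the background.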


