Closing the Generalization Gap in One-Shot Object Detection

Abstract

Despite substantial progress in object detection and few-shot learning, detecting objects based on a single example (one-shot object detection) remains a challenge: trained models exhibit a substantial generalization gap, where object categories used during training are detected much more reliably than novel ones. Here we show that this generalization gap can be nearly closed by increasing the number of object categories used during training. Our results show that the models switch from memorizing individual categories to learning object similarity over the category distribution, enabling strong generalization at test time. Importantly, in this regime, standard methods for improving object detection models, such as stronger backbones or longer training schedules, also benefit novel categories, which was not the case for smaller datasets like COCO. Our results suggest that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead in scaling the number of categories. Future data annotation efforts should therefore focus on wider datasets and annotate a larger number of categories rather than gathering more images or instances per category.
