Bi-Directional Domain Translation for Zero-Shot Sketch-Based Image Retrieval

Abstract

The goal of Sketch-Based Image Retrieval (SBIR) is to use free-hand sketches to retrieve images of the same category from a natural image gallery. However, SBIR requires all categories to be seen during training, which cannot be guaranteed in real-world applications. We therefore investigate the more challenging Zero-Shot SBIR (ZS-SBIR) setting, in which test categories do not appear in the training stage. Traditional SBIR methods tend to perform category-based retrieval and do not generalize well from seen categories to unseen ones. In contrast, we disentangle image features into structure features and appearance features to facilitate structure-based retrieval. To assist feature disentanglement and take full advantage of the disentangled information, we propose a Bi-directional Domain Translation (BDT) framework for ZS-SBIR, in which the image domain and the sketch domain can be translated to each other through disentangled structure and appearance features. Finally, we perform retrieval in both the structure feature space and the image feature space. Extensive experiments demonstrate that our proposed approach remarkably outperforms state-of-the-art approaches by about 8% on the Sketchy dataset and over 5% on the TU-Berlin dataset.
