Fast Single-Class Classification and the Principle of Logit Separation

Abstract

We consider neural network training, in applications in which there are many possible classes, but at test-time, the task is a binary classification task of determining whether the given example belongs to a specific class, where the class of interest can be different each time the classifier is applied. For instance, this is the case for real-time image search. We define the Single Logit Classification (SLC) task: training the network so that at test-time, it would be possible to accurately identify whether the example belongs to a given class in a computationally efficient manner, based only on the output logit for this class. We propose a natural principle, the Principle of Logit Separation, as a guideline for choosing and designing losses suitable for the SLC. We show that the cross-entropy loss function is not aligned with the Principle of Logit Separation. In contrast, there are known loss functions, as well as novel batch loss functions that we propose, which are aligned with this principle. In total, we study seven loss functions. Our experiments show that indeed in almost all cases, losses that are aligned with the Principle of Logit Separation obtain at least 20% relative accuracy improvement in the SLC task compared to losses that are not aligned with it, and sometimes considerably more. Furthermore, we show that fast SLC does not cause any drop in binary classification accuracy, compared to standard classification in which all logits are computed, and yields a speedup which grows with the number of classes. For instance, we demonstrate a 10x speedup when the number of classes is 400,000. TensorFlow code for optimizing the new batch losses is publicly available at https://github.com/cruvadom/Logit_Separation.
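
The test-time setting described in the abstract can be illustrated with a minimal sketch. It assumes the network ends in a linear output layer, so that the logit for class k is a single dot product with that class's weight row. The function names, the toy weights, and the fixed threshold of 0.0 are hypothetical and are not taken from the paper or its released code. The sketch only contrasts computing one logit with computing all of them; as the abstract notes, whether a single logit can be thresholded reliably depends on training with a loss that is aligned with the Principle of Logit Separation.

```python
import math


def single_logit(features, weights, bias, k):
    """Logit of class k only: one dot product, cost O(d)."""
    return sum(f * w for f, w in zip(features, weights[k])) + bias[k]


def all_logits(features, weights, bias):
    """Every logit: cost O(d * num_classes)."""
    return [single_logit(features, weights, bias, k)
            for k in range(len(weights))]


def slc_predict(features, weights, bias, k, threshold=0.0):
    """Single-logit decision: is the example in class k?

    The threshold is an assumed, illustrative value. In practice it
    would be chosen on validation data.
    """
    return single_logit(features, weights, bias, k) > threshold


def softmax_prob(features, weights, bias, k):
    """Standard alternative: needs all logits to normalize."""
    logits = all_logits(features, weights, bias)
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    return exps[k] / sum(exps)


# Toy last layer with 3 classes and 2 features (made-up numbers).
weights = [[0.5, -1.0], [1.0, 1.0], [0.0, 0.25]]
bias = [0.0, -1.0, 0.5]
features = [1.0, 2.0]

print(slc_predict(features, weights, bias, k=1))   # uses one weight row
print(softmax_prob(features, weights, bias, k=1))  # uses every weight row
```

The single-logit path reads one row of the output layer, while the softmax path reads all rows, which is consistent with the abstract's statement that the speedup grows with the number of classes.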
