Abstract
Self-training allows a network to learn from the predictions of a more complicated model, and thus often requires well-trained teacher models and a mixture of teacher-student data, while multi-task learning jointly optimizes different targets to learn salient interrelationships and requires multi-task annotations for each training example. These frameworks, despite being particularly data-demanding, have potential for data exploitation if such assumptions can be relaxed. In this paper, we compare self-training object detection under a deficiency of teacher training data, where students are trained on examples unseen by the teacher, and multi-task learning with partially annotated data, i.e., a single-task annotation per training example. Both scenarios have their own limitations but are potentially helpful when annotated data are limited. Experimental results show improved performance when using a weak teacher with unseen data to train a multi-task student. Despite the limited setup, we believe the experimental results show the potential of multi-task knowledge distillation and self-training, which could be beneficial for future study. Source code is at https://lhoangan.github.io/multas.