Auxiliary Tasks in Multi-task Learning

Abstract

Multi-task convolutional neural networks (CNNs) have shown impressive results for certain combinations of tasks, such as single-image depth estimation (SIDE) and semantic segmentation. This is achieved by pushing the network towards learning a robust representation that generalizes well to different atomic tasks. We extend this concept by adding auxiliary tasks, which are of minor relevance for the application, to the set of learned tasks. As a kind of additional regularization, they are expected to boost the performance of the ultimately desired main tasks. To study the proposed approach, we picked vision-based road scene understanding (RSU) as an exemplary application. Since multi-task learning requires specialized datasets, particularly when using extensive sets of tasks, we provide a multi-modal dataset for multi-task RSU, called synMT. More than $2.5 \cdot 10^5$ synthetic images, annotated with 21 different labels, were acquired from the video game Grand Theft Auto V (GTA V). Our proposed deep multi-task CNN architecture was trained on various combinations of tasks using synMT. The experiments confirmed that auxiliary tasks can indeed boost network performance, both in terms of final results and training time.
