ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving

Abstract

Autonomous driving has attracted remarkable attention from both industry and academia. An important task is to estimate the 3D properties (e.g. translation, rotation and shape) of a moving or parked vehicle on the road. This task, while critical, is still under-researched in the computer vision community, partly owing to the lack of a large-scale, fully annotated 3D car database suitable for autonomous driving research. In this paper, we contribute the first large-scale database suitable for 3D car instance understanding: ApolloCar3D. The dataset contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. The dataset is more than 20 times larger than PASCAL3D+ and KITTI, the current state of the art. To enable efficient labelling in 3D, we build a pipeline that considers 2D-3D keypoint correspondences for a single instance and 3D relationships among multiple instances. Equipped with this dataset, we build various baseline algorithms with state-of-the-art deep convolutional neural networks. Specifically, we first segment each car with a pre-trained Mask R-CNN, and then regress its 3D pose and shape based on a deformable 3D car model, with or without using semantic keypoints. We show that using keypoints significantly improves fitting performance. Finally, we develop a new 3D metric that jointly considers 3D pose and 3D shape, allowing for comprehensive evaluation and ablation studies. By comparing with human performance, we suggest several directions for future improvement.
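
To make the idea of 2D-3D keypoint correspondences concrete, the following is a minimal sketch of a reprojection error: 3D keypoints on a car model are projected into the image under a candidate pose and compared with labelled 2D keypoints. It is an illustration only, not the labelling pipeline or baseline of the paper. The pinhole intrinsics `K`, the four model keypoints, the yaw-only rotation and all function names are assumptions chosen for the example.

```python
import numpy as np

def rotation_y(yaw):
    """Rotation about the camera's vertical (y) axis; yaw-only is an assumption."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def project_keypoints(points_3d, R, t, K):
    """Project Nx3 model keypoints into the image with a pinhole camera."""
    cam = points_3d @ R.T + t          # object frame -> camera frame
    uvw = cam @ K.T                    # apply intrinsics
    return uvw[:, :2] / uvw[:, 2:3]    # perspective divide

def mean_reprojection_error(points_3d, keypoints_2d, R, t, K):
    """Mean pixel distance between projected and labelled 2D keypoints."""
    projected = project_keypoints(points_3d, R, t, K)
    return float(np.mean(np.linalg.norm(projected - keypoints_2d, axis=1)))

# Hypothetical camera intrinsics and four hypothetical model keypoints (metres).
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])
model_points = np.array([[0.9, 0.0, 2.0],
                         [-0.9, 0.0, 2.0],
                         [0.9, 0.0, -2.0],
                         [-0.9, 0.0, -2.0]])

# Synthesize "labelled" 2D keypoints from a known pose, then score two candidates.
R_true = rotation_y(0.3)
t_true = np.array([1.0, 1.5, 20.0])
observed = project_keypoints(model_points, R_true, t_true, K)

err_true = mean_reprojection_error(model_points, observed, R_true, t_true, K)
err_wrong = mean_reprojection_error(model_points, observed, rotation_y(0.0), t_true, K)
print(err_true, err_wrong)
```

A pose that explains the labelled keypoints yields a near-zero error, while a wrong yaw yields a larger one; a fitting procedure would search for the pose (and, with a deformable model, the shape) that minimizes such an error.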
