The ApolloScape Dataset for Autonomous Driving

Abstract

Scene parsing aims to assign a class (semantic) label to each pixel in an image. It is a comprehensive analysis of an image. Given the rise of autonomous driving, pixel-accurate environmental perception is expected to be a key enabling technical piece. However, providing a large-scale dataset for the design and evaluation of scene parsing algorithms, in particular for outdoor scenes, has been difficult. The per-pixel labelling process is prohibitively expensive, limiting the scale of existing datasets. In this paper, we present a large-scale open dataset, ApolloScape, that consists of RGB videos and corresponding dense 3D point clouds. Compared with existing datasets, our dataset has the following unique properties. The first is its scale: our initial release contains over 140K images, each with its per-pixel semantic mask, and up to 1M are scheduled. The second is its complexity: captured in various traffic conditions, the number of moving objects per image averages from tens to over one hundred. The third is the 3D attribute: each image is tagged with high-accuracy pose information at centimeter-level accuracy, and the static background point cloud has millimeter-level relative accuracy. We are able to label this many images through an interactive and efficient labelling pipeline that utilizes the high-quality 3D point cloud. Moreover, our dataset also contains different lane markings, categorized by lane color and style. We expect our new dataset to benefit various autonomous-driving applications, including but not limited to 2D/3D scene understanding, localization, transfer learning, and driving simulation.
