Joint 3D Proposal Generation and Object Detection from View Aggregation

Abstract

We present AVOD, an Aggregate View Object Detection network for autonomous driving scenarios. The proposed neural network architecture uses LIDAR point clouds and RGB images to generate features that are shared by two subnetworks: a region proposal network (RPN) and a second-stage detector network. The proposed RPN uses a novel architecture capable of performing multimodal feature fusion to generate reliable 3D object proposals for multiple object classes in road scenes. Using these proposals, the second-stage detection network performs accurate oriented 3D bounding box regression and category classification to predict the extents, orientation, and classification of objects in 3D space. Our proposed architecture is shown to produce state-of-the-art results on the KITTI 3D object detection benchmark while running in real time with a low memory footprint, making it a suitable candidate for deployment on autonomous vehicles.
