Clouds of Oriented Gradients for 3D Detection of Objects, Surfaces, and Indoor Scene Layouts

Abstract

We develop new representations and algorithms for three-dimensional (3D) object detection and spatial layout prediction in cluttered indoor scenes. We first propose a clouds of oriented gradient (COG) descriptor that links the 2D appearance and 3D pose of object categories, and thus accurately models how perspective projection affects perceived image gradients. To better represent the 3D visual styles of large objects and provide contextual cues to improve the detection of small objects, we introduce latent support surfaces. We then propose a "Manhattan voxel" representation which better captures the 3D room layout geometry of common indoor environments. Effective classification rules are learned via a latent structured prediction framework. Contextual relationships among categories and layout are captured via a cascade of classifiers, leading to holistic scene hypotheses that exceed the state-of-the-art on the SUN RGB-D database.
