MonoLayout: Amodal scene layout from a single image

  • 2020-02-19 19:16:34
  • Kaustubh Mani, Swapnil Daga, Shubhika Garg, N. Sai Shankar, Krishna Murthy Jatavallabhula, K. Madhava Krishna

Abstract

In this paper, we address the novel, highly challenging problem of estimating the layout of a complex urban driving scenario. Given a single color image captured from a driving platform, we aim to predict the bird's-eye view layout of the road and other traffic participants. The estimated layout should reason beyond what is visible in the image, and compensate for the loss of 3D information due to projection. We dub this problem amodal scene layout estimation, which involves "hallucinating" scene layout even for parts of the world that are occluded in the image. To this end, we present MonoLayout, a deep neural network for real-time amodal scene layout estimation from a single image. We represent scene layout as a multi-channel semantic occupancy grid, and leverage adversarial feature learning to hallucinate plausible completions for occluded image parts. Due to the lack of fair baseline methods, we extend several state-of-the-art approaches for road-layout estimation and vehicle occupancy estimation in bird's-eye view to the amodal setup for rigorous evaluation. By leveraging temporal sensor fusion to generate training labels, we significantly outperform current art over a number of datasets. On the KITTI and Argoverse datasets, we outperform all baselines by a significant margin. We also make all our annotations and code publicly available. A video abstract of this paper is available at https://www.youtube.com/watch?v=HcroGyo6yRQ .
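To make the "multi-channel semantic occupancy grid" concrete, here is a minimal sketch of the output representation only. It is not the authors' implementation: the `Box` type, the `rasterize` helper, the 8 m x 8 m extent, and the 1 m cell size are all hypothetical choices made for illustration. Each semantic class gets its own binary bird's-eye-view channel, and the road channel stays filled underneath the vehicle, which is the amodal part.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical axis-aligned footprint in metres (x lateral, z forward)."""
    x_min: float
    x_max: float
    z_min: float
    z_max: float


def rasterize(boxes, width_m=8.0, depth_m=8.0, cell_m=1.0):
    """Return a rows x cols grid of 0/1 occupancy for one semantic class.

    Row 0 is the strip farthest from the camera; the camera sits at the
    bottom centre. A cell is occupied if its centre lies inside any box.
    Grid extent and resolution are illustrative assumptions.
    """
    cols = int(width_m / cell_m)
    rows = int(depth_m / cell_m)
    grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        z = depth_m - (r + 0.5) * cell_m  # forward distance of the cell centre
        for c in range(cols):
            x = -width_m / 2 + (c + 0.5) * cell_m  # lateral offset of the cell centre
            if any(b.x_min <= x <= b.x_max and b.z_min <= z <= b.z_max for b in boxes):
                grid[r][c] = 1
    return grid


# One channel per semantic class; the scene contents are made up.
road = [Box(-2.0, 2.0, 0.0, 8.0)]
vehicles = [Box(-1.0, 1.0, 4.0, 6.0)]
layout = {"road": rasterize(road), "vehicle": rasterize(vehicles)}

for name, channel in layout.items():
    print(name)
    for row in channel:
        print("".join("#" if v else "." for v in row))
```

In this toy scene, the cells covered by the vehicle are also marked as road, and the road continues behind the vehicle where a camera would not see it. A network in the spirit of the abstract would predict such channels from the image rather than rasterize them from known boxes.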

 
