PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation

  • 2021-01-07 10:32:17
  • Abdallah Benzine, Florian Chabot, Bertrand Luvison, Quoc Cong Pham, Catherine Achard
Recently, several deep learning models have been proposed for 3D human pose estimation. Nevertheless, most of these approaches focus only on the single-person case or estimate the 3D pose of a few people at high resolution. Yet many applications, such as autonomous driving or crowd analysis, require pose estimation of a large number of people, possibly at low resolution. In this work, we present PandaNet (Pose estimAtioN and Detection Anchor-based Network), a new single-shot, anchor-based, multi-person 3D pose estimation approach. The proposed model performs bounding box detection and, for each detected person, 2D and 3D pose regression in a single forward pass. It does not need any post-processing to regroup joints, since the network predicts a full 3D pose for each bounding box, and it allows pose estimation of a possibly large number of people at low resolution. To manage overlapping people, we introduce a Pose-Aware Anchor Selection strategy. Moreover, as an imbalance exists between the different sizes of people in the image, and joint coordinates have different uncertainties depending on these sizes, we propose a method to automatically optimize the weights associated with different people scales and joints for efficient training. PandaNet surpasses previous single-shot methods on several challenging datasets: a virtual but highly realistic multi-person urban dataset (JTA Dataset), and two real-world 3D multi-person datasets (CMU Panoptic and MuPoTS-3D).
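
The abstract does not say how the per-scale and per-joint weights are optimized. As a rough illustration only, the sketch below uses one common technique for learning loss weights, learned log-variance (uncertainty) weighting, which may or may not match the authors' method. The function name `weighted_total_loss` and its parameters are hypothetical and do not come from the paper.

```python
import math


def weighted_total_loss(losses, log_vars):
    """Combine several loss terms using learnable log-variance weights.

    ASSUMPTION: this is generic uncertainty weighting, not necessarily
    the scheme used in PandaNet. Each term contributes
    exp(-s) * loss + s, where s is a learnable scalar. A large s
    down-weights a noisy term, while the "+ s" penalty keeps s from
    growing without bound.
    """
    if len(losses) != len(log_vars):
        raise ValueError("losses and log_vars must have the same length")
    return sum(math.exp(-s) * loss + s for loss, s in zip(losses, log_vars))


# Hypothetical example: two loss terms, e.g. one per people scale.
total = weighted_total_loss([1.0, 2.0], [0.0, 0.0])
```

With all `log_vars` at zero the result is the plain sum of the losses. In a training loop the `log_vars` would be optimized jointly with the network parameters, so that terms with higher uncertainty are automatically given less weight.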


