Whole-Body Human Pose Estimation in the Wild

Abstract

This paper investigates the task of 2D human whole-body pose estimation, which aims to localize dense landmarks on the entire human body, including the face, hands, body, and feet. As existing datasets do not have whole-body annotations, previous methods have to assemble different deep models trained independently on different datasets of the human face, hand, and body, struggling with dataset biases and large model complexity. To fill this gap, we introduce COCO-WholeBody, which extends the COCO dataset with whole-body annotations. To the best of our knowledge, it is the first benchmark that has manual annotations on the entire human body, including 133 dense landmarks: 68 on the face, 42 on the hands, and 23 on the body and feet. A single-network model, named ZoomNet, is devised to take into account the hierarchical structure of the full human body in order to handle the scale variation among different body parts of the same person. ZoomNet significantly outperforms existing methods on the proposed COCO-WholeBody dataset. Extensive experiments show that COCO-WholeBody can be used not only to train deep models from scratch for whole-body pose estimation but also as a powerful pre-training dataset for other tasks such as facial landmark detection and hand keypoint estimation. The dataset is publicly available at https://github.com/jin-s13/COCO-WholeBody.
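As a reading aid, the landmark breakdown stated in the abstract can be sketched as a partition of a flat keypoint array. This is a minimal illustration only: the group counts come from the abstract, but the group ordering, the names `GROUP_COUNTS` and `group_slices`, and the flat-array layout are assumptions made here and are not taken from the dataset's actual annotation format.

```python
from collections import OrderedDict

# Landmark counts per group, as stated in the abstract.
# ASSUMPTION: the order of the groups below is hypothetical and chosen
# only for illustration; consult the dataset for its real layout.
GROUP_COUNTS = OrderedDict([
    ("body_and_feet", 23),
    ("face", 68),
    ("hands", 42),
])


def group_slices(counts):
    """Map each group name to a slice over a flat keypoint array."""
    slices = {}
    start = 0
    for name, n in counts.items():
        slices[name] = slice(start, start + n)
        start += n
    return slices


SLICES = group_slices(GROUP_COUNTS)
TOTAL_KEYPOINTS = sum(GROUP_COUNTS.values())
```

Under these assumptions, `SLICES["face"]` selects the 68 face landmarks from a 133-element keypoint array, and `TOTAL_KEYPOINTS` confirms that the three groups account for every landmark.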
