Learning to Train with Synthetic Humans

Abstract

Neural networks need big annotated datasets for training. However, manual annotation can be too expensive or even unfeasible for certain tasks, like multi-person 2D pose estimation with severe occlusions. A remedy for this is synthetic data with perfect ground truth. Here we explore two variations of synthetic data for this challenging problem; a dataset with purely synthetic humans and a real dataset augmented with synthetic humans. We then study which approach better generalizes to real data, as well as the influence of virtual humans in the training loss. Using the augmented dataset, without considering synthetic humans in the loss, leads to the best results. We observe that not all synthetic samples are equally informative for training, while the informative samples are different for each training stage. To exploit this observation, we employ an adversarial student-teacher framework; the teacher improves the student by providing the hardest samples for its current state as a challenge. Experiments show that the student-teacher framework outperforms normal training on the purely synthetic dataset.
