Digi2Real: Bridging the Realism Gap in Synthetic Data Face Recognition via Foundation Models

Abstract

The accuracy of face recognition systems has improved significantly in the past few years, thanks to the large amount of data collected and the advancement in neural network architectures. However, these large-scale datasets are often collected without explicit consent, raising ethical and privacy concerns. To address this, there have been proposals to use synthetic datasets for training face recognition models. Yet, such models still rely on real data to train the generative models and generally exhibit inferior performance compared to those trained on real datasets. One of these datasets, DigiFace, uses a graphics pipeline to generate different identities and different intra-class variations without using real data in training the models. However, the performance of this approach is poor on face recognition benchmarks, possibly due to the lack of realism in the images generated from the graphics pipeline. In this work, we introduce a novel framework for realism transfer aimed at enhancing the realism of synthetically generated face images. Our method leverages the large-scale face foundation model, and we adapt the pipeline for realism enhancement. By integrating the controllable aspects of the graphics pipeline with our realism enhancement technique, we generate a large amount of realistic variations, combining the advantages of both approaches. Our empirical evaluations demonstrate that models trained using our enhanced dataset significantly improve the performance of face recognition systems over the baseline. The source code and datasets will be made available publicly.
