On the Content Bias in Fréchet Video Distance

  • 2024-04-18 18:59:58
  • Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar, Jun-Yan Zhu, Jia-Bin Huang

Abstract

Fréchet Video Distance (FVD), a prominent metric for evaluating video generation models, is known to conflict with human perception occasionally. In this paper, we aim to explore the extent of FVD's bias toward per-frame quality over temporal realism and identify its sources. We first quantify the FVD's sensitivity to the temporal axis by decoupling the frame and motion quality and find that the FVD increases only slightly with large temporal corruption. We then analyze the generated videos and show that via careful sampling from a large set of generated videos that do not contain motions, one can drastically decrease FVD without improving the temporal quality. Both studies suggest FVD's bias towards the quality of individual frames. We further observe that the bias can be attributed to the features extracted from a supervised video classifier trained on the content-biased dataset. We show that FVD with features extracted from the recent large-scale self-supervised video models is less biased toward image quality. Finally, we revisit a few real-world examples to validate our hypothesis.
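
For readers unfamiliar with the metric, the following is a minimal sketch of the Fréchet distance between two sets of feature vectors, each modeled as a Gaussian. This is the standard closed form, not the authors' code. The function name `frechet_distance` and the argument names are illustrative assumptions, and the video feature extractor that turns clips into vectors is deliberately left out: the inputs are assumed to be precomputed features of shape (num_videos, feature_dim).

```python
import numpy as np


def frechet_distance(feats_real, feats_gen):
    """Fréchet distance between Gaussians fitted to two feature sets.

    Both inputs are assumed to be arrays of shape (num_videos, feature_dim)
    with feature_dim >= 2, produced by some video feature extractor that is
    not part of this sketch.
    """
    mu_r = feats_real.mean(axis=0)
    mu_g = feats_gen.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g. For positive semi-definite inputs these
    # eigenvalues are real and non-negative; the clip only guards against
    # small negative values caused by floating-point error.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)
```

Because the distance depends only on the features' means and covariances, any bias in what the feature extractor encodes (for example, per-frame content rather than motion) carries over directly to the score, which is the kind of effect the abstract describes.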

 
