VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

  • 2024-11-20 17:54:41
  • Ziqi Huang, Fan Zhang, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu

Abstract

Video generation has witnessed significant advancements, yet evaluating these models remains a challenge. A comprehensive evaluation benchmark for video generation is indispensable for two reasons: 1) Existing metrics do not fully align with human perceptions; 2) An ideal evaluation system should provide insights to inform future developments of video generation. To this end, we present VBench, a comprehensive benchmark suite that dissects "video generation quality" into specific, hierarchical, and disentangled dimensions, each with tailored prompts and evaluation methods. VBench has several appealing properties: 1) Comprehensive Dimensions: VBench comprises 16 dimensions in video generation (e.g., subject identity inconsistency, motion smoothness, temporal flickering, and spatial relationship). The evaluation metrics with fine-grained levels reveal individual models' strengths and weaknesses. 2) Human Alignment: We also provide a dataset of human preference annotations to validate our benchmarks' alignment with human perception, for each evaluation dimension respectively. 3) Valuable Insights: We look into current models' ability across various evaluation dimensions, and various content types. We also investigate the gaps between video and image generation models. 4) Versatile Benchmarking: VBench++ supports evaluating text-to-video and image-to-video. We introduce a high-quality Image Suite with an adaptive aspect ratio to enable fair evaluations across different image-to-video generation settings. Beyond assessing technical quality, VBench++ evaluates the trustworthiness of video generative models, providing a more holistic view of model performance. 5) Full Open-Sourcing: We fully open-source VBench++ and continually add new video generation models to our leaderboard to drive forward the field of video generation.

 
