Grade Inflation in Generative Models

  • 2025-01-22 21:15:18
  • Phuc Nguyen, Miao Li, Alexandra Morgan, Rima Arnaout, Ramy Arnaout


Generative models hold great potential, but only if one can trust the evaluation of the data they generate. We show that many commonly used quality scores for comparing two-dimensional distributions of synthetic vs. ground-truth data give better results than they should, a phenomenon we call the "grade inflation problem." We show that the correlation score, Jaccard score, earth-mover's score, and Kullback-Leibler (relative-entropy) score all suffer grade inflation. We propose that any score that values all data points equally, as these do, will also exhibit grade inflation; we refer to such scores as "equipoint" scores. We introduce the concept of "equidensity" scores, and present the Eden score, to our knowledge the first example of such a score. We found that Eden avoids grade inflation and agrees better with human perception of goodness-of-fit than the equipoint scores above. We propose that any reasonable equidensity score will avoid grade inflation. We identify a connection between equidensity scores and Rényi entropy of negative order. We conclude that equidensity scores are likely to outperform equipoint scores for generative models, and for comparing low-dimensional distributions more generally.
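To make the scores named above concrete, here is a minimal sketch of how three of the equipoint scores (correlation, Jaccard, and Kullback-Leibler) could be computed for two-dimensional point clouds, along with the standard Rényi entropy formula, which stays well defined for negative orders when restricted to occupied bins. The abstract does not give the authors' exact definitions, so the histogram binning, the smoothing constant `eps`, and every function name below are assumptions made for illustration. The Eden score is not reproduced here because the abstract does not define it.

```python
import numpy as np


def binned_density(points, bins=10, extent=((0.0, 1.0), (0.0, 1.0))):
    """Bin 2D points into a normalized histogram (an assumed preprocessing step)."""
    hist, _, _ = np.histogram2d(points[:, 0], points[:, 1], bins=bins, range=extent)
    return hist / hist.sum()


def correlation_score(p, q):
    """Pearson correlation between the flattened bin masses."""
    return float(np.corrcoef(p.ravel(), q.ravel())[0, 1])


def jaccard_score(p, q):
    """Overlap of occupied bins: |intersection| / |union|."""
    a, b = p > 0, q > 0
    return float((a & b).sum() / (a | b).sum())


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q); eps is an assumed smoothing term."""
    p = (p + eps) / (p + eps).sum()
    q = (q + eps) / (q + eps).sum()
    return float(np.sum(p * np.log(p / q)))


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha != 1), computed over occupied bins only.

    H_alpha(p) = log(sum_i p_i ** alpha) / (1 - alpha)
    Negative alpha gives the most weight to the lowest-probability bins.
    """
    p = p[p > 0]
    return float(np.log(np.sum(p ** alpha)) / (1.0 - alpha))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    truth = rng.random((2000, 2))            # stand-in for ground-truth data
    synthetic = 0.5 * rng.random((2000, 2))  # stand-in that covers one quadrant only

    p, q = binned_density(truth), binned_density(synthetic)
    print("correlation:", correlation_score(p, q))
    print("jaccard:    ", jaccard_score(p, q))
    print("KL(p || q): ", kl_divergence(p, q))
    print("Renyi entropy of p, order -1:", renyi_entropy(p, -1.0))
```

Every function here sums over bins weighted by the mass they hold, which is one way to read the "values all data points equally" property that the abstract attributes to equipoint scores.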


