Demystifying MMD GANs

Abstract

We investigate the training and performance of generative adversarial networks using the Maximum Mean Discrepancy (MMD) as critic, termed MMD GANs. As our main theoretical contribution, we clarify the situation with bias in GAN loss functions raised by recent work: we show that gradient estimators used in the optimization process for both MMD GANs and Wasserstein GANs are unbiased, but learning a discriminator based on samples leads to biased gradients for the generator parameters. We also discuss the issue of kernel choice for the MMD critic, and characterize the kernel corresponding to the energy distance used for the Cramer GAN critic. Being an integral probability metric, the MMD benefits from training strategies recently developed for Wasserstein GANs. In experiments, the MMD GAN is able to employ a smaller critic network than the Wasserstein GAN, resulting in a simpler and faster-training algorithm with matching performance. We also propose an improved measure of GAN convergence, the Kernel Inception Distance, and show how to use it to dynamically adapt learning rates during GAN training.
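
As a rough illustration of the quantity the abstract refers to, the following is a minimal sketch of the standard unbiased estimator of the squared MMD between two samples. It is not the paper's implementation: the function names `poly_kernel` and `mmd2_unbiased` are hypothetical, and the cubic polynomial kernel is an assumed choice made for illustration, since the abstract does not specify a kernel.

```python
from typing import Callable, Sequence

Vector = Sequence[float]


def poly_kernel(x: Vector, y: Vector, degree: int = 3) -> float:
    """Polynomial kernel (<x, y> / d + 1) ** degree, where d = len(x).

    Assumed kernel for this sketch; any positive-definite kernel could be used.
    """
    d = len(x)
    dot = sum(a * b for a, b in zip(x, y))
    return (dot / d + 1.0) ** degree


def mmd2_unbiased(
    xs: Sequence[Vector],
    ys: Sequence[Vector],
    kernel: Callable[[Vector, Vector], float] = poly_kernel,
) -> float:
    """Unbiased estimate of MMD^2 between samples xs and ys.

    The within-sample sums skip the diagonal terms k(x_i, x_i); that omission
    is what removes the bias of the plain plug-in estimator.
    """
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two points")

    # Sum of k(x_i, x_j) over ordered pairs with i != j.
    k_xx = sum(kernel(xs[i], xs[j]) for i in range(m) for j in range(m) if i != j)
    # Sum of k(y_i, y_j) over ordered pairs with i != j.
    k_yy = sum(kernel(ys[i], ys[j]) for i in range(n) for j in range(n) if i != j)
    # Sum of k(x_i, y_j) over all cross pairs.
    k_xy = sum(kernel(x, y) for x in xs for y in ys)

    return k_xx / (m * (m - 1)) + k_yy / (n * (n - 1)) - 2.0 * k_xy / (m * n)


if __name__ == "__main__":
    real = [[0.0], [2.0]]
    fake = [[1.0], [3.0]]
    print(mmd2_unbiased(real, fake))
```

Because the estimator is unbiased rather than constrained to be non-negative, it can return negative values on small samples, as it does for the toy inputs above. In practice the inputs would be feature vectors of real and generated images rather than one-dimensional points.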
