The Maximum Mean Discrepancy (MMD) has found numerous applications in statistics and machine learning, most recently as a penalty in the Wasserstein Auto-Encoder (WAE). In this paper we compute closed-form expressions for estimating the Gaussian kernel based MMD between a given distribution and the standard multivariate normal distribution. We introduce the standardized version of MMD as a penalty for the WAE training objective, allowing for better interpretability of MMD values and greater compatibility across different hyperparameter settings. Next, we propose using a version of batch normalization at the code layer; this has the benefits of making the kernel width selection easier, reducing the training effort, and preventing outliers in the aggregate code distribution. Finally, we discuss the appropriate null distributions and provide thresholds for multivariate normality testing with the standardized MMD, leading to a number of easy rules of thumb for monitoring the progress of WAE training. Curiously, our MMD formula reveals a connection to the Baringhaus-Henze-Epps-Pulley (BHEP) statistic of the Henze-Zirkler test and provides further insights about the MMD. Our experiments on synthetic and real data show that the analytic formulation improves over the commonly used stochastic approximation of the MMD, and demonstrate that code normalization provides significant benefits when training WAEs.
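
To make the idea of a closed-form estimate concrete, here is a minimal sketch of how the squared MMD between a sample and the standard multivariate normal can be computed without drawing any reference samples. It assumes the Gaussian kernel is parameterized as k(x, y) = exp(-||x - y||^2 / (2 gamma^2)) and uses the usual unbiased U-statistic for the sample term; the two expectations involving the normal distribution are replaced by their standard Gaussian-integral values. The function name `mmd2_to_standard_normal` and the parameter `gamma` are hypothetical, and the sketch does not reproduce the paper's exact expressions or its standardization.

```python
import numpy as np


def mmd2_to_standard_normal(z, gamma=1.0):
    """Estimate squared MMD between the sample z (shape n x d) and N(0, I).

    Assumed kernel: k(x, y) = exp(-||x - y||^2 / (2 * gamma^2)).
    Only the sample-vs-sample term is estimated from data; the other two
    terms are evaluated analytically.
    """
    z = np.asarray(z, dtype=float)
    n, d = z.shape
    g2 = gamma ** 2

    # Pairwise squared distances between the sample points.
    sq_norms = np.sum(z ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (z @ z.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against round-off

    # Unbiased sample term: average of k(z_i, z_j) over i != j.
    k = np.exp(-sq_dists / (2.0 * g2))
    sample_term = (k.sum() - np.trace(k)) / (n * (n - 1))

    # E_y k(z_i, y) for y ~ N(0, I), averaged over the sample.
    cross_term = (g2 / (1.0 + g2)) ** (d / 2.0) * np.mean(
        np.exp(-sq_norms / (2.0 * (1.0 + g2)))
    )

    # E k(y, y') for independent y, y' ~ N(0, I).
    normal_term = (g2 / (2.0 + g2)) ** (d / 2.0)

    return float(sample_term - 2.0 * cross_term + normal_term)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    codes = rng.standard_normal((500, 2))
    print("normal sample :", mmd2_to_standard_normal(codes))
    print("shifted sample:", mmd2_to_standard_normal(codes + 3.0))
```

Because the sample term is unbiased, the estimate can come out slightly negative for a sample that is close to normal; a clearly positive value indicates a departure from the standard normal.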