Abstract
We analyse the properties of an unbiased gradient estimator of the ELBO for variational inference, based on the score function method with leave-one-out control variates. We show that this gradient estimator can be obtained using a new loss, defined as the variance of the log-ratio between the exact posterior and the variational approximation, which we call the $\textit{log-variance loss}$. Under certain conditions, the gradient of the log-variance loss equals the gradient of the (negative) ELBO. We show theoretically that this gradient estimator, which we call $\textit{VarGrad}$ due to its connection to the log-variance loss, exhibits lower variance than the score function method in certain settings, and that the leave-one-out control variate coefficients are close to the optimal ones. We empirically demonstrate that VarGrad offers a favourable variance versus computation trade-off compared to other state-of-the-art estimators on a discrete VAE.
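The connection between the two views of the estimator can be illustrated with a minimal sketch, under assumptions that the abstract does not spell out. It assumes that the samples are drawn from the current variational distribution but held fixed when differentiating, that the loss is half the unbiased sample variance of the log-ratio, and that the log-ratio is computed against the unnormalised joint rather than the exact posterior (the two differ by a constant, which leaves the variance unchanged). The factorised Bernoulli `q`, the toy target `log_joint` and all function names are hypothetical and chosen only for illustration.

```python
import math
import random


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def sample_q(theta):
    # Draw one bit vector from independent Bernoulli(sigmoid(theta_d)).
    return [1 if random.random() < sigmoid(t) else 0 for t in theta]


def log_q(z, theta):
    # Log-probability of the bit vector z under the factorised Bernoulli q.
    return sum(
        math.log(sigmoid(t)) if zd == 1 else math.log(1.0 - sigmoid(t))
        for zd, t in zip(z, theta)
    )


def score(z, theta):
    # Gradient of log_q(z, theta) with respect to the logits theta.
    return [zd - sigmoid(t) for zd, t in zip(z, theta)]


def log_joint(z):
    # Hypothetical unnormalised log p(x, z) over three bits:
    # it rewards agreement with the pattern (1, 0, 1).
    return sum(1.5 if zd == td else -1.5 for zd, td in zip(z, (1, 0, 1)))


def log_ratios(samples, theta):
    # f(z) = log q(z) - log p(x, z) for every sample.
    return [log_q(z, theta) - log_joint(z) for z in samples]


def loo_score_function_grad(samples, theta):
    # Score function estimate of the negative-ELBO gradient, where each
    # sample's baseline is the mean log-ratio of the *other* samples.
    n = len(samples)
    f = log_ratios(samples, theta)
    grad = [0.0] * len(theta)
    for s, z in enumerate(samples):
        baseline = (sum(f) - f[s]) / (n - 1)
        for d, g in enumerate(score(z, theta)):
            grad[d] += (f[s] - baseline) * g / n
    return grad


def log_variance_loss(samples, theta):
    # Half the unbiased sample variance of the log-ratio (assumed scaling).
    f = log_ratios(samples, theta)
    mean = sum(f) / len(f)
    return 0.5 * sum((v - mean) ** 2 for v in f) / (len(f) - 1)


def log_variance_grad(samples, theta, eps=1e-5):
    # Central finite differences of the loss, with the samples held fixed.
    grad = []
    for d in range(len(theta)):
        up = list(theta)
        down = list(theta)
        up[d] += eps
        down[d] -= eps
        diff = log_variance_loss(samples, up) - log_variance_loss(samples, down)
        grad.append(diff / (2.0 * eps))
    return grad
```

Calling `loo_score_function_grad(samples, theta)` and `log_variance_grad(samples, theta)` on the same list of at least two samples from `sample_q` should give gradients that agree up to finite-difference error, which is the sample-level version of the equivalence stated above. Finite differences are used here only to keep the sketch dependency-free; an automatic differentiation library would normally take their place.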