The LORACs prior for VAEs: Letting the Trees Speak for the Data

Abstract

In variational autoencoders, the prior on the latent codes $z$ is often treated as an afterthought, but the prior shapes the kind of latent representation that the model learns. If the goal is to learn a representation that is interpretable and useful, then the prior should reflect the ways in which the high-level factors that describe the data vary. The "default" prior is an isotropic normal, but if the natural factors of variation in the dataset exhibit discrete structure or are not independent, then the isotropic-normal prior will actually encourage learning representations that mask this structure. To alleviate this problem, we propose using a flexible Bayesian nonparametric hierarchical clustering prior based on the time-marginalized coalescent (TMC). To scale learning to large datasets, we develop a new inducing-point approximation and inference algorithm. We then apply the method without supervision to several datasets and examine the interpretability and practical performance of the inferred hierarchies and learned latent space.
