The framework of variational autoencoders allows us to efficiently learn deep latent-variable models, such that the model's marginal distribution over observed variables fits the data. Often, we're interested in going a step further, and want to approximate the true joint distribution over observed and latent variables, including the true prior and posterior distributions over latent variables. This is known to be generally impossible due to unidentifiability of the model. We address this issue by showing that for a broad family of deep latent-variable models, identification of the true joint distribution over observed and latent variables is actually possible up to a simple transformation, thus achieving a principled and powerful form of disentanglement. Our result requires a factorized prior distribution over the latent variables that is conditioned on an additionally observed variable, such as a class label or almost any other observation. We build on recent developments in nonlinear ICA, which we extend to the case with noisy, undercomplete or discrete observations, integrated in a maximum likelihood framework. The result also trivially contains identifiable flow-based generative models as a special case.