MIM: Mutual Information Machine

Abstract

We introduce the Mutual Information Machine (MIM), an autoencoder model for learning joint distributions over observations and latent states. The model formulation reflects two key design principles: 1) symmetry, to encourage the encoder and decoder to learn consistent factorizations of the same underlying distribution; and 2) mutual information, to encourage the learning of useful representations for downstream tasks. The objective comprises the Jensen-Shannon divergence between the encoding and decoding joint distributions, plus a mutual information term. We show that this objective can be bounded by a tractable cross-entropy loss between the true model and a parameterized approximation, and relate this to maximum likelihood estimation and variational autoencoders. Experiments show that MIM is capable of learning a latent representation with high mutual information, and good unsupervised clustering, while providing data log likelihood comparable to VAE (with a sufficiently expressive architecture).
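As a rough illustration of the two quantities named in the objective, the sketch below computes a Jensen-Shannon divergence between two joint distributions and the mutual information of each, using small discrete tables. This is only a toy under stated assumptions: the tables `q_joint` and `p_joint` and all function names are hypothetical, the paper's model works with parameterized encoders and decoders rather than explicit tables, and the sketch does not reproduce the actual MIM loss or its cross-entropy bound.

```python
import math


def entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def flatten(table):
    """Turn a 2-D table indexed [x][z] into a flat list of probabilities."""
    return [p for row in table for p in row]


def jsd(p, q):
    """Jensen-Shannon divergence between two flat distributions on the same support.

    Uses the identity JSD(p, q) = H(m) - (H(p) + H(q)) / 2 with m = (p + q) / 2.
    """
    m = [0.5 * (a + b) for a, b in zip(p, q)]
    return entropy(m) - 0.5 * (entropy(p) + entropy(q))


def mutual_information(joint):
    """I(x; z) = H(x) + H(z) - H(x, z) for a joint table indexed [x][z]."""
    px = [sum(row) for row in joint]        # marginal over z
    pz = [sum(col) for col in zip(*joint)]  # marginal over x
    return entropy(px) + entropy(pz) - entropy(flatten(joint))


# Hypothetical toy joints standing in for the "encoding" and "decoding"
# distributions over a binary observation x and a binary latent z.
q_joint = [[0.40, 0.10],
           [0.10, 0.40]]
p_joint = [[0.35, 0.15],
           [0.15, 0.35]]

divergence = jsd(flatten(q_joint), flatten(p_joint))
mi_q = mutual_information(q_joint)
mi_p = mutual_information(p_joint)

print(f"JSD(q, p) = {divergence:.4f} nats")
print(f"I_q(x; z) = {mi_q:.4f} nats, I_p(x; z) = {mi_p:.4f} nats")
```

In this toy setting, a small divergence means the two joints agree (the symmetry principle), while a larger mutual information means the latent carries more information about the observation.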
