Adaptive Nonparametric Variational Autoencoder

Abstract

Clustering finds structure in unlabeled data by grouping similar objects together, so the outcome of a cluster analysis depends on how similarity is defined in the feature space. In this paper, we propose an Adaptive Nonparametric Variational Autoencoder (AdapVAE) that performs end-to-end feature learning from raw data jointly with cluster membership learning, using a nonparametric Bayesian modeling framework with deep neural networks. This avoids both a predefined notion of similarity and manual feature engineering. Our model relaxes the constraint of fixing the number of clusters in advance by placing a Dirichlet Process prior on the latent representation in a low-dimensional feature space. In an online unsupervised learning setting, it can adaptively detect novel clusters when new data arrive, starting from a model learned on historical data. We develop a joint online variational inference algorithm that learns feature representations and cluster assignments by iteratively optimizing the evidence lower bound (ELBO). Our experimental results demonstrate that the modeling framework can learn the number of clusters automatically from data, detect novel clusters adaptively as new data emerge, reconstruct and generate high-quality samples without supervised information, and improve on state-of-the-art end-to-end clustering methods in terms of accuracy on both image and text corpora benchmark datasets.
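
To illustrate why a Dirichlet Process prior removes the need to fix the number of clusters, the following is a minimal sketch of the standard truncated stick-breaking construction of DP mixture weights. It is not the AdapVAE implementation: the function name `stick_breaking_weights`, the concentration `alpha`, and the truncation level are assumptions chosen for illustration, and the abstract does not state which DP representation the model uses.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Draw mixture weights from a truncated stick-breaking construction.

    Hypothetical helper for illustration only; not taken from the paper.
    """
    # v_k ~ Beta(1, alpha): fraction of the remaining stick given to component k
    v = rng.beta(1.0, alpha, size=truncation)
    v[-1] = 1.0  # close the stick so the truncated weights sum to one
    # remaining[k] = prod_{j<k} (1 - v_j), the stick length left before step k
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=1.0, truncation=20, rng=rng)

# Assign 500 points to components; typically only a few components receive data,
# so the effective number of clusters is determined by the draw, not fixed up front.
assignments = rng.choice(len(weights), size=500, p=weights)
n_used = len(np.unique(assignments))
print(f"components available: {len(weights)}, components used: {n_used}")
```

The truncation level only caps how many components may be used; a larger `alpha` tends to spread mass over more components, which is what allows new clusters to appear as more data arrive.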
