Locally Masked Convolution for Autoregressive Models

Abstract

High-dimensional generative models have many applications, including image compression, multimedia generation, anomaly detection and data completion. State-of-the-art estimators for natural images are autoregressive, decomposing the joint distribution over pixels into a product of conditionals parameterized by a deep neural network, e.g. a convolutional neural network such as the PixelCNN. However, PixelCNNs only model a single decomposition of the joint, and only a single generation order is efficient. For tasks such as image completion, these models are unable to use much of the observed context. To generate data in arbitrary orders, we introduce LMConv: a simple modification to the standard 2D convolution that allows arbitrary masks to be applied to the weights at each location in the image. Using LMConv, we learn an ensemble of distribution estimators that share parameters but differ in generation order, achieving improved performance on whole-image density estimation (2.89 bpd on unconditional CIFAR10), as well as globally coherent image completions. Our code is available at https://ajayjain.github.io/lmconv.
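
To make the idea of a per-location mask concrete, the following is a minimal pure-Python sketch, not the released implementation. It assumes a single-channel image, one shared k x k filter, and zero padding; the names `masks_for_order` and `locally_masked_conv` are hypothetical. For a given generation order, each location gets its own mask that keeps only the neighbors generated earlier, so the same weights can serve different orders.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]


def masks_for_order(order: List[Coord], k: int = 3) -> Dict[Coord, List[float]]:
    """Build one flattened k*k mask per location (hypothetical helper).

    A mask entry is 1.0 only if the neighbor lies inside the image and is
    generated strictly earlier than the center pixel in `order`.
    """
    rank = {pos: t for t, pos in enumerate(order)}
    r = k // 2
    masks = {}
    for (i, j) in order:
        mask = []
        for di in range(-r, r + 1):
            for dj in range(-r, r + 1):
                nb = (i + di, j + dj)
                visible = nb in rank and rank[nb] < rank[(i, j)]
                mask.append(1.0 if visible else 0.0)
        masks[(i, j)] = mask
    return masks


def locally_masked_conv(image: List[List[float]],
                        weights: List[float],
                        masks: Dict[Coord, List[float]],
                        k: int = 3) -> List[List[float]]:
    """Convolve with shared weights, applying a different mask at each location."""
    height, width = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Gather the zero-padded k*k patch around (i, j).
            patch = []
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    inside = 0 <= y < height and 0 <= x < width
                    patch.append(image[y][x] if inside else 0.0)
            mask = masks[(i, j)]
            # Masking the weights and masking the patch give the same product.
            out[i][j] = sum(w * m * p for w, m, p in zip(weights, mask, patch))
    return out


# Two generation orders over a 3x3 image, sharing the same weights.
raster = [(i, j) for i in range(3) for j in range(3)]
column_major = [(i, j) for j in range(3) for i in range(3)]
image = [[1.0] * 3 for _ in range(3)]
weights = [1.0] * 9
out_raster = locally_masked_conv(image, weights, masks_for_order(raster))
out_column = locally_masked_conv(image, weights, masks_for_order(column_major))
```

With all-ones inputs, each output simply counts the previously generated neighbors, so `out_raster` and `out_column` differ at locations such as the top-right corner even though the weights are identical. A practical version would vectorize the patch extraction and handle multiple channels.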
