Abstract
Energy-based models (EBMs) are powerful probabilistic models, but suffer from intractable sampling and density evaluation due to the partition function. As a result, inference in EBMs relies on approximate sampling algorithms, leading to a mismatch between the model and inference. Motivated by this, we consider the sampler-induced distribution as the model of interest and maximize the likelihood of this model. This yields a class of energy-inspired models (EIMs) that incorporate learned energy functions while still providing exact samples and tractable log-likelihood lower bounds. We describe and evaluate three instantiations of such models based on truncated rejection sampling, self-normalized importance sampling, and Hamiltonian importance sampling. These models outperform or perform comparably to the recently proposed Learned Accept/Reject Sampling algorithm and provide new insights on ranking Noise Contrastive Estimation and Contrastive Predictive Coding. Moreover, EIMs allow us to generalize a recent connection between multi-sample variational lower bounds and auxiliary variable variational inference. We show how recent variational bounds can be unified with EIMs as the variational family.