Attentive Neural Processes

  • 2019-01-17 12:37:26
  • Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, Yee Whye Teh

Abstract

Neural Processes (NPs) (Garnelo et al., 2018a;b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions. Each function models the distribution of the output given an input, conditioned on the context. NPs have the benefit of fitting observed data efficiently with linear complexity in the number of context input-output pairs, and can learn a wide family of conditional distributions; they learn predictive distributions conditioned on context sets of arbitrary size. Nonetheless, we show that NPs suffer a fundamental drawback of underfitting, giving inaccurate predictions at the inputs of the observed data they condition on. We address this issue by incorporating attention into NPs, allowing each input location to attend to the relevant context points for the prediction. We show that this greatly improves the accuracy of predictions, results in noticeably faster training, and expands the range of functions that can be modelled.
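The contrast between a single shared summary of the context and a per-target, attention-weighted summary can be illustrated with a small NumPy sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `mean_aggregate` and `attentive_aggregate`, the `scale` parameter, the choice of a softmax over negative L1 distances as the attention weights, and the hand-written `r_context` array standing in for the outputs of a learned encoder.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def mean_aggregate(r_context, num_targets):
    # Shared summary: every target receives the same mean of the
    # per-context-point representations.
    r = r_context.mean(axis=0)
    return np.tile(r, (num_targets, 1))

def attentive_aggregate(x_context, r_context, x_target, scale=0.1):
    # Per-target summary: each target input weights the context
    # representations by how close its input is to each context input.
    # (Hypothetical weighting: softmax over negative L1 distance.)
    dist = np.abs(x_target[:, None, :] - x_context[None, :, :]).sum(axis=-1)
    weights = softmax(-dist / scale, axis=-1)   # (num_targets, num_context)
    return weights @ r_context, weights

# Toy data: three context inputs and made-up 2-d representations.
x_context = np.array([[-1.0], [0.0], [1.0]])
r_context = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x_target = np.array([[-1.0], [0.9]])

r_mean = mean_aggregate(r_context, len(x_target))
r_attn, weights = attentive_aggregate(x_context, r_context, x_target)
```

In this sketch `r_mean` is identical for both targets, whereas `r_attn` is dominated by the representation of the nearest context point, so a target that coincides with a context input sees mostly that point's information. Note that the pairwise weights cost time proportional to the number of targets times the number of context points.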

 
