Hyperbolic Attention Networks

  • 2018-05-24 17:11:35
  • Caglar Gulcehre, Misha Denil, Mateusz Malinowski, Ali Razavi, Razvan Pascanu, Karl Moritz Hermann, Peter Battaglia, Victor Bapst, David Raposo, Adam Santoro, Nando de Freitas
  • 63

Abstract

We introduce hyperbolic attention networks to endow neural networks with enough capacity to match the complexity of data with hierarchical and power-law structure. A few recent approaches have successfully demonstrated the benefits of imposing hyperbolic geometry on the parameters of shallow networks. We extend this line of work by imposing hyperbolic geometry on the activations of neural networks. This allows us to exploit hyperbolic geometry to reason about embeddings produced by deep networks. We achieve this by re-expressing the ubiquitous mechanism of soft attention in terms of operations defined for hyperboloid and Klein models. Our method shows improvements in terms of generalization on neural machine translation, learning on graphs, and visual question answering tasks while keeping the neural representations compact.
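To make the idea of soft attention expressed through hyperboloid and Klein operations more concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and the abstract does not specify these details. The sketch assumes that queries, keys, and values are points in the Klein model (the open unit ball), that matching scores are negative hyperbolic distances computed by lifting points to the hyperboloid, that weights are normalized with a softmax, and that values are aggregated with the Einstein midpoint. The function names and the parameters `beta` and `c` are hypothetical.

```python
import numpy as np

def lorentz_factor(x):
    # gamma(x) = 1 / sqrt(1 - ||x||^2), defined for points inside the open unit ball.
    return 1.0 / np.sqrt(1.0 - np.sum(x * x, axis=-1))

def klein_distance(a, b):
    # Lift each Klein point x to the hyperboloid point (gamma, gamma * x).
    # The negated Minkowski inner product of the lifts is
    # gamma_a * gamma_b * (1 - <a, b>), and the distance is its arccosh.
    arg = lorentz_factor(a) * lorentz_factor(b) * (1.0 - np.sum(a * b, axis=-1))
    return np.arccosh(np.maximum(arg, 1.0))  # clamp guards against rounding below 1

def hyperbolic_attention(query, keys, values, beta=1.0, c=0.0):
    # Assumption: scores are -beta * distance - c, normalized with a softmax.
    dist = klein_distance(query[None, :], keys)
    logits = -beta * dist - c
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()
    # Einstein midpoint: average of the values weighted by attention weight
    # times Lorentz factor; the result stays inside the unit ball.
    coef = weights * lorentz_factor(values)
    return (coef[:, None] * values).sum(axis=0) / coef.sum()

if __name__ == "__main__":
    q = np.array([0.3, 0.1])
    kv = np.array([[0.9, 0.0], [0.0, -0.8], [-0.2, 0.4]])
    print(hyperbolic_attention(q, kv, kv))
```

In this sketch, keys that are closer to the query in hyperbolic distance receive larger weights, and the aggregated output remains a valid point of the Klein model because the midpoint is a convex combination of points in the ball.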

 
