Abstract
Attention is an increasingly popular mechanism used in a wide range of neural architectures. Because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures for natural language processing, with a focus on architectures designed to work with vector representation of the textual data. We discuss the dimensions along which proposals differ, the possible uses of attention, and chart the major research activities and open challenges in the area.