The Importance of Being Recurrent for Modeling Hierarchical Structure

Abstract

Recent work has shown that recurrent neural networks (RNNs) can implicitlycapture and exploit hierarchical information when trained to solve commonnatural language processing tasks such as language modeling (Linzen et al.,2016) and neural machine translation (Shi et al., 2016). In contrast, theability to model structured data with non-recurrent neural networks hasreceived little attention despite their success in many NLP tasks (Gehring etal., 2017; Vaswani et al., 2017). In this work, we compare the twoarchitectures---recurrent versus non-recurrent---with respect to their abilityto model hierarchical structure and find that recurrency is indeed importantfor this purpose.

Quick Read (beta)

loading the full paper ...