Abstract
Understanding human language is one of the key themes of artificial intelligence. For language representation, the capacity to effectively model linguistic knowledge from detail-riddled, lengthy texts while filtering out noise is essential for good performance. Traditional attentive models attend to all words without explicit constraints, which results in inaccurate concentration on some dispensable words. In this work, we propose using syntax to guide text modeling by incorporating explicit syntactic constraints into the attention mechanism for better linguistically motivated word representations. In detail, for a Transformer-based encoder built on the self-attention network (SAN), we introduce a syntactic dependency of interest (SDOI) design into the SAN to form an SDOI-SAN with syntax-guided self-attention. The syntax-guided network (SG-Net) is then composed of this extra SDOI-SAN and the SAN from the original Transformer encoder through a dual contextual architecture for better linguistically inspired representations. The proposed SG-Net is applied to typical Transformer encoders. Extensive experiments on popular benchmark tasks, including machine reading comprehension, natural language inference, and neural machine translation, show the effectiveness of the proposed SG-Net design.
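
To make the idea of a syntactic constraint on attention concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes, for illustration only, that the SDOI of a token consists of the token itself and its ancestors in the dependency tree, and that the dual contextual architecture can be approximated by a weighted average of the two attention outputs. The names `sdoi_mask`, `attention`, `dual_context`, and the weight `alpha` are hypothetical, and the learned query, key, and value projections of a real Transformer layer are omitted.

```python
import math

def sdoi_mask(heads):
    """Build a syntax mask from dependency heads.

    heads[i] is the index of token i's head, or -1 for the root.
    ASSUMPTION: token i may attend to itself and to its ancestors only.
    """
    n = len(heads)
    mask = [[0] * n for _ in range(n)]
    for i in range(n):
        j, seen = i, set()
        while j != -1 and j not in seen:  # 'seen' guards against cyclic input
            mask[i][j] = 1
            seen.add(j)
            j = heads[j]
    return mask

def attention(q, k, v, mask):
    """Scaled dot-product attention; positions with mask 0 get zero weight."""
    d = len(q[0])
    outputs, weights = [], []
    for i in range(len(q)):
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            if mask[i][j] else float("-inf")
            for j in range(len(k))
        ]
        top = max(scores)  # finite, because every token may attend to itself
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        row = [e / total for e in exps]
        weights.append(row)
        outputs.append(
            [sum(row[j] * v[j][c] for j in range(len(v))) for c in range(len(v[0]))]
        )
    return outputs, weights

def dual_context(x, heads, alpha=0.5):
    """ASSUMPTION: combine plain and syntax-guided outputs by weighted average."""
    n = len(x)
    full = [[1] * n for _ in range(n)]
    plain, _ = attention(x, x, x, full)
    guided, _ = attention(x, x, x, sdoi_mask(heads))
    return [
        [alpha * p + (1 - alpha) * g for p, g in zip(p_row, g_row)]
        for p_row, g_row in zip(plain, guided)
    ]

# Toy sentence "The cat sat": The -> cat, cat -> sat, sat is the root.
heads = [1, 2, -1]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up token vectors
combined = dual_context(x, heads)
```

In this toy example the root token "sat" has no ancestors, so its syntax-guided attention falls entirely on itself, while "The" may attend to all three tokens. The unmasked branch still sees every word, which is why the two outputs are combined rather than one replacing the other.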