Retrofitting Structure-aware Transformer Language Model for End Tasks

Abstract

We consider retrofitting a structure-aware Transformer-based language model to facilitate end tasks, proposing to exploit syntactic distance to encode both phrasal constituency and dependency connections into the language model. A middle-layer structural learning strategy is leveraged for structure integration, carried out alongside the main semantic task training under a multi-task learning scheme. Experimental results show that the retrofitted structure-aware Transformer language model achieves improved perplexity while inducing accurate syntactic phrases. By performing structure-aware fine-tuning, our model achieves significant improvements on both semantic- and syntactic-dependent tasks.
