An Augmented Transformer Architecture for Natural Language Generation Tasks

Abstract

Transformer-based neural networks have shown significant advantages on most evaluations of natural language processing and other sequence-to-sequence tasks, owing to their inherent architectural strengths. Although the main architecture of the Transformer has been continuously explored, little attention has been paid to the positional encoding module. In this paper, we enhance the sinusoidal positional encoding algorithm by maximizing the variances between encoded consecutive positions to obtain an additional improvement. Furthermore, we propose an augmented Transformer architecture encoded with additional linguistic knowledge, such as Part-of-Speech (POS) tagging, to boost performance on some natural language generation tasks, e.g., automatic translation and summarization. Experiments show that the proposed architecture attains consistently superior results compared to the vanilla Transformer.
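
For orientation, the sinusoidal positional encoding that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration of the standard encoding used in the vanilla Transformer, not the enhanced variant proposed in this paper. The function names, the `base` parameter, and the helper that measures the distance between consecutive positions are assumptions introduced here for illustration; in particular, the helper is only a simple diagnostic and is not claimed to be the variance objective the paper maximizes.

```python
import math


def sinusoidal_positional_encoding(num_positions, d_model, base=10000.0):
    """Standard (vanilla Transformer) sinusoidal encoding.

    Returns a list of num_positions rows, each with d_model floats.
    Even indices hold sines and odd indices hold cosines.
    """
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(num_positions):
        row = [0.0] * d_model
        for i in range(d_model // 2):
            # Wavelengths grow geometrically with the dimension index i.
            angle = pos / (base ** (2 * i / d_model))
            row[2 * i] = math.sin(angle)
            row[2 * i + 1] = math.cos(angle)
        table.append(row)
    return table


def consecutive_sq_distance(table, pos):
    """Hypothetical diagnostic (an assumption, not the paper's objective):
    squared Euclidean distance between the encodings of pos and pos + 1."""
    a, b = table[pos], table[pos + 1]
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    pe = sinusoidal_positional_encoding(num_positions=6, d_model=8)
    for p in range(5):
        print(p, round(consecutive_sq_distance(pe, p), 6))
```

In the standard encoding this distance is the same for every pair of adjacent positions, because each sine/cosine pair rotates by a fixed angle per step. How the paper changes the encoding to separate consecutive positions further is described in the full text, not in this sketch.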
