Improving Relation Extraction by Pre-trained Language Representations

Abstract

Current state-of-the-art relation extraction methods typically rely on a set of lexical, syntactic, and semantic features, explicitly computed in a pre-processing step. Training feature extraction models requires additional annotated language resources, which severely restricts the applicability and portability of relation extraction to novel languages. Similarly, pre-processing introduces an additional source of error. To address these limitations, we introduce TRE, a Transformer for Relation Extraction, extending the OpenAI Generative Pre-trained Transformer [Radford et al., 2018]. Unlike previous relation extraction models, TRE uses pre-trained deep language representations instead of explicit linguistic features to inform the relation classification and combines it with the self-attentive Transformer architecture to effectively model long-range dependencies between entity mentions. TRE allows us to learn implicit linguistic features solely from plain text corpora by unsupervised pre-training, before fine-tuning the learned language representations on the relation extraction task. TRE obtains a new state-of-the-art result on the TACRED and SemEval 2010 Task 8 datasets, achieving a test F1 of 67.4 and 87.1, respectively. Furthermore, we observe a significant increase in sample efficiency. With only 20% of the training examples, TRE matches the performance of our baselines and our model trained from scratch on 100% of the TACRED dataset. We open-source our trained models, experiments, and source code.
