A deep Natural Language Inference predictor without language-specific training data

Abstract

In this paper we present an NLP technique for predicting the inference relation (Natural Language Inference, NLI) between pairs of sentences in a target language of choice without a language-specific training dataset. We exploit a generic, manually translated translation dataset along with two instances of the same pre-trained model: the first generates sentence embeddings for the source language, and the second is fine-tuned over the target language to mimic the first. This technique is known as Knowledge Distillation. The model has been evaluated over the machine-translated Stanford NLI test dataset, the machine-translated Multi-Genre NLI test dataset, and the manually translated RTE3-ITA test dataset. We also test the proposed architecture over different tasks to empirically demonstrate the generality of the NLI task: the model has been evaluated over the native Italian ABSITA dataset, on the tasks of Sentiment Analysis, Aspect-Based Sentiment Analysis, and Topic Recognition. We emphasise the generality and exploitability of the Knowledge Distillation technique, which outperforms other methodologies based on machine translation, even though the former was not directly trained on the data it was tested over.
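The teacher-student setup described above can be illustrated with a toy sketch. It is not the paper's implementation: the student here is a small linear map in pure Python rather than a pre-trained language model, the mean-squared-error objective is an assumption (the abstract only says the second model learns to "mimic" the first), and all names (`student_embed`, `distill_step`, `pairs`) and all numbers are hypothetical. The frozen teacher is represented only by its precomputed embeddings of the source-language sentences.

```python
import random


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def student_embed(weights, features):
    """Toy student: a linear map from input features to an embedding."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def distill_step(weights, features, teacher_vec, lr=0.1):
    """One gradient step pulling the student output toward the teacher vector.

    Updates `weights` in place and returns the loss measured before the update.
    """
    pred = student_embed(weights, features)
    dim = len(pred)
    for i, row in enumerate(weights):
        # Derivative of the MSE with respect to the i-th output component.
        grad_out = 2.0 * (pred[i] - teacher_vec[i]) / dim
        for j in range(len(row)):
            row[j] -= lr * grad_out * features[j]
    return mse(pred, teacher_vec)


# Hypothetical parallel data: each pair holds toy features of a target-language
# sentence and the frozen teacher's embedding of its source-language translation.
pairs = [
    ([1.0, 0.0, 1.0], [0.5, -0.2]),
    ([0.0, 1.0, 1.0], [-0.3, 0.8]),
]

random.seed(0)
weights = [[random.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(2)]

losses = []
for epoch in range(200):
    epoch_loss = sum(distill_step(weights, f, t) for f, t in pairs) / len(pairs)
    losses.append(epoch_loss)

print(f"first-epoch loss: {losses[0]:.4f}, last-epoch loss: {losses[-1]:.2e}")
```

Only the student's weights are updated; the teacher vectors stay fixed throughout. Once the student reproduces the teacher's embedding space for target-language input, a classifier trained on the teacher's embeddings could in principle be applied to the student's output without target-language NLI labels.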
