Low-Resource Syntactic Transfer with Unsupervised Source Reordering

  • 2019-03-13 19:01:00
  • Mohammad Sadegh Rasooli, Michael Collins
  • 18

Abstract

We describe a cross-lingual transfer method for dependency parsing that takes into account the problem of word order differences between source and target languages. Our model relies only on the Bible, a considerably smaller parallel corpus than the parallel data commonly used in transfer methods. We use the concatenation of projected trees from the Bible corpus and the gold-standard treebanks in multiple source languages, along with cross-lingual word representations. We demonstrate that reordering the source treebanks before training on them for a target language improves accuracy for languages outside the European language family. Our experiments on 68 treebanks (38 languages) in the Universal Dependencies corpus achieve high accuracy for all languages. Among them, our experiments on 16 treebanks of 12 non-European languages achieve an average absolute improvement of 3.3% in unlabeled attachment score (UAS) over a state-of-the-art method.
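
For readers unfamiliar with the metric, UAS is the percentage of tokens whose predicted head matches the gold-standard head, and an absolute improvement is a difference in percentage points between two such scores. The following is a minimal sketch of that computation, not the authors' evaluation code; the function name `uas` and the toy head lists are assumptions made for illustration, and details such as punctuation handling are left out.

```python
from typing import Sequence


def uas(gold_heads: Sequence[int], pred_heads: Sequence[int]) -> float:
    """Unlabeled attachment score as a percentage.

    Each list holds one head index per token (0 denotes the root).
    Hypothetical helper for illustration only.
    """
    if len(gold_heads) != len(pred_heads):
        raise ValueError("gold and predicted head lists must have equal length")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    # Count tokens whose predicted head equals the gold head.
    correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
    return 100.0 * correct / len(gold_heads)


# Toy five-token sentence (made-up data): the parser gets one head wrong.
gold = [2, 0, 2, 5, 3]
pred = [2, 0, 2, 3, 3]
score = uas(gold, pred)
```

In practice the score is usually aggregated over all tokens in a treebank rather than averaged per sentence; the sketch scores a single sentence to keep the example short.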

 
