Cross-lingual Universal Dependency Parsing Only from One Monolingual Treebank

  • 2021-04-23 06:36:16
  • Kailai Sun, Zuchao Li, Hai Zhao

Abstract

Syntactic parsing is a highly linguistic processing task whose parsers require training on treebanks produced by expensive human annotation. As it is unlikely that a treebank can be obtained for every human language, in this work we propose an effective cross-lingual UD parsing framework for transferring a parser from only one source monolingual treebank to any other target language for which no treebank is available. To reach satisfactory parsing accuracy across quite different languages, we introduce two language modeling tasks into dependency parsing as multi-task learning. Assuming that only unlabeled data from the target languages plus the source treebank can be exploited together, we adopt a self-training strategy for further performance improvement on top of our multi-task framework. Our proposed cross-lingual parsers are implemented for English, Chinese, and 22 UD treebanks. The empirical study shows that our cross-lingual parsers yield promising results for all target languages, for the first time approaching the performance of a parser trained on its own target treebank.

 
