On the Importance of Word Order Information in Cross-lingual Sequence Labeling

Abstract

Word order generally varies across languages. In this paper, we hypothesize that cross-lingual models that fit the word order of the source language might fail to handle target languages. To verify this hypothesis, we investigate whether making models insensitive to the word order of the source language can improve adaptation performance in target languages. To do so, we reduce the source-language word order information fitted to sequence encoders and observe the resulting performance changes. In addition, based on this hypothesis, we propose a new method for fine-tuning multilingual BERT on downstream cross-lingual sequence labeling tasks. Experimental results on dialogue natural language understanding, part-of-speech tagging, and named entity recognition tasks show that reducing the word order information fitted to the model can achieve better zero-shot cross-lingual performance. Furthermore, our proposed methods can also be applied to strong cross-lingual baselines and improve their performance.
