Abstract
For languages with no annotated resources, unsupervised transfer of natural language processing models such as named-entity recognition (NER) from resource-rich languages would be an appealing capability. However, differences in words and word order across languages make it a challenging problem. To improve mapping of lexical items across languages, we propose a method that finds translations based on bilingual word embeddings. To improve robustness to word order differences, we propose to use self-attention, which allows for a degree of flexibility with respect to word order. We demonstrate that these methods achieve state-of-the-art or competitive NER performance on commonly tested languages under a cross-lingual setting, with much lower resource requirements than past approaches. We also evaluate the challenges of applying these methods to Uyghur, a low-resource language.