Abstract
We present UDify, a multilingual multi-task model capable of accurately predicting universal part-of-speech tags, morphological features, lemmas, and dependency trees simultaneously for all 124 Universal Dependencies treebanks across 75 languages. By leveraging a multilingual BERT self-attention model pretrained on 104 languages, we found that fine-tuning it on all datasets concatenated together with simple softmax classifiers for each UD task can result in state-of-the-art UPOS, UFeats, Lemmas, UAS, and LAS scores, without requiring any recurrent or language-specific components. We evaluate UDify for multilingual learning, showing that low-resource languages benefit the most from cross-linguistic annotations. We also evaluate for zero-shot learning, with results suggesting that multilingual training provides strong UD predictions even for languages that neither UDify nor BERT has ever been trained on. Code for UDify is available at https://github.com/hyperparticle/udify.