From English to Code-Switching: Transfer Learning with Strong Morphological Clues

Abstract

Code-switching is still an understudied phenomenon in natural language processing mainly because of two related challenges: it lacks annotated data, and it combines a vast diversity of low-resource languages. Despite the language diversity, many code-switching scenarios occur in language pairs, and English is often a common factor among them. In the first part of this paper, we use transfer learning from English to English-paired code-switched languages for the language identification (LID) task by applying two simple yet effective techniques: 1) a hierarchical attention mechanism that enhances morphological clues from character n-grams, and 2) a secondary loss that forces the model to learn n-gram representations that are particular to the languages involved. We use the bottom layers of the ELMo architecture to learn these morphological clues by essentially recognizing what is and what is not English. Our approach outperforms the previous state of the art on Nepali-English, Spanish-English, and Hindi-English datasets. In the second part of the paper, we use our best LID models for the tasks of Spanish-English named entity recognition and Hindi-English part-of-speech tagging by replacing their inference layers and retraining them. We show that our retrained models are capable of using the code-switching information on both tasks to outperform models that do not have such knowledge.
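
To make the idea of hierarchical attention over character n-grams more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each word already has one vector per n-gram position for each n-gram order (for example, bigrams and trigrams). It first attends over positions within each order, then attends over the resulting per-order summaries. The function names (`attend`, `hierarchical_attention`) and the two query vectors are hypothetical; in a trained model the queries would be learned parameters, and the actual scoring function may differ from the plain dot product used here.

```python
import math


def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attend(vectors, query):
    """Dot-product attention: return (weights, weighted sum of vectors)."""
    weights = softmax([dot(v, query) for v in vectors])
    dim = len(vectors[0])
    pooled = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, pooled


def hierarchical_attention(ngram_vectors, position_query, order_query):
    """ngram_vectors maps n-gram order -> list of vectors, one per position.

    Level 1: attend over positions within each n-gram order.
    Level 2: attend over the per-order summaries to get one word vector.
    """
    orders = sorted(ngram_vectors)
    per_order = [attend(ngram_vectors[n], position_query)[1] for n in orders]
    order_weights, word_vector = attend(per_order, order_query)
    return dict(zip(orders, order_weights)), word_vector


# Toy input (assumed values): two unigram vectors and one bigram vector.
toy = {1: [[1.0, 0.0], [0.0, 1.0]], 2: [[1.0, 1.0]]}
weights, vector = hierarchical_attention(toy, [0.0, 0.0], [0.0, 0.0])
```

With all-zero queries every attention distribution is uniform, so the sketch reduces to plain averaging at both levels; non-zero queries would let the model emphasize particular n-grams or n-gram orders.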
