Abstract
In countries that speak multiple main languages, mixing up differentlanguages within a conversation is commonly called code-switching. Previousworks addressing this challenge mainly focused on word-level aspects such asword embeddings. However, in many cases, languages share common subwords,especially for closely related languages, but also for languages that areseemingly irrelevant. Therefore, we propose Hierarchical Meta-Embeddings (HME)that learn to combine multiple monolingual word-level and subword-levelembeddings to create language-agnostic lexical representations. On the task ofNamed Entity Recognition for English-Spanish code-switching data, our modelachieves the state-of-the-art performance in the multilingual settings. We alsoshow that, in cross-lingual settings, our model not only leverages closelyrelated languages, but also learns from languages with different roots.Finally, we show that combining different subunits are crucial for capturingcode-switching entities.