Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition

Abstract

Named entity recognition (NER) is a fundamental component in many applications, such as Web Search and Voice Assistants. Although deep neural networks greatly improve the performance of NER, due to the requirement of large amounts of training data, deep neural networks can hardly scale out to many languages in an industry setting. To tackle this challenge, cross-lingual NER transfers knowledge from a rich-resource language to languages with low resources through pre-trained multilingual language models. Instead of using training data in target languages, cross-lingual NER has to rely on only training data in source languages, and optionally adds the translated training data derived from source languages. However, the existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages, which is relatively easy to collect in industry applications. To address the opportunities and challenges, in this paper we describe our novel practice at Microsoft to leverage such large amounts of unlabeled data in target languages in real production settings. To effectively extract weak supervision signals from the unlabeled data, we develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning. The empirical study on three benchmark data sets verifies that our approach establishes the new state-of-the-art performance by a clear margin. Now, the NER techniques reported in this paper are on their way to becoming a fundamental component for Web ranking, Entity Pane, Answers Triggering, and Question Answering in the Microsoft Bing search engine. Moreover, our techniques will also serve as part of the Spoken Language Understanding module for a commercial voice assistant. We plan to open source the code of the prototype framework after deployment.
