Training Compact Models for Low Resource Entity Tagging using Pre-trained Language Models

Abstract

Training models for low-resource named entity recognition tasks has been shown to be a challenge, especially in industrial applications where deploying updated models is a continuous effort and crucial for business operations. In such cases there is often an abundance of unlabeled data, while labeled data is scarce or unavailable. Pre-trained language models trained to extract contextual features from text have been shown to improve many natural language processing (NLP) tasks, including scarcely labeled tasks, by leveraging transfer learning. However, such models impose a heavy memory and computational burden, making it a challenge to train them and deploy them for inference. In this work-in-progress we combine the effectiveness of transfer learning provided by pre-trained masked language models with a semi-supervised approach to train a fast and compact model using labeled and unlabeled examples. Preliminary evaluations show that the compact models achieve competitive accuracy compared to state-of-the-art pre-trained language models, with up to a 36x compression rate, and run significantly faster at inference, thus allowing deployment of such models in production environments or on edge devices.
