Abstract
Training models on low-resource named entity recognition tasks has been shown to be a challenge, especially in industrial applications where deploying updated models is a continuous effort and crucial for business operations. Often in such cases there is an abundance of unlabeled data, while labeled data is scarce or unavailable. Pre-trained language models trained to extract contextual features from text have been shown to improve many natural language processing (NLP) tasks, including scarcely labeled tasks, by leveraging transfer learning. However, such models impose a heavy memory and computational burden, making it a challenge to train them and deploy them for inference. In this work-in-progress we combine the effectiveness of transfer learning provided by pre-trained masked language models with a semi-supervised approach to train a fast and compact model using labeled and unlabeled examples. Preliminary evaluations show that the compact models achieve competitive accuracy compared to state-of-the-art pre-trained language models with up to a 36x compression rate and run significantly faster in inference, thus allowing deployment of such models in production environments or on edge devices.