Does Synthetic Data Help Named Entity Recognition for Low-Resource Languages?

  • 2025-11-05 18:06:35
  • Gaurav Kamath, Sowmya Vajjala

Abstract

Named Entity Recognition (NER) for low-resource languages aims to produce robust systems for languages where there is limited labeled training data available, and has been an area of increasing interest within NLP. Data augmentation for increasing the amount of low-resource labeled data is a common practice. In this paper, we explore the role of synthetic data in the context of multilingual, low-resource NER, considering 11 languages from diverse language families. Our results suggest that synthetic data does in fact hold promise for low-resource language NER, though we see significant variation between languages.

 
