TartuNLP @ SIGTYP 2024 Shared Task: Adapting XLM-RoBERTa for Ancient and Historical Languages

Abstract

We present our submission to the unconstrained subtask of the SIGTYP 2024 Shared Task on Word Embedding Evaluation for Ancient and Historical Languages for morphological annotation, POS-tagging, lemmatization, character- and word-level gap-filling. We developed a simple, uniform, and computationally lightweight approach based on the adapters framework using parameter-efficient fine-tuning. We applied the same adapter-based approach uniformly to all tasks and 16 languages by fine-tuning stacked language- and task-specific adapters. Our submission obtained an overall second place out of three submissions, with the first place in word-level gap-filling. Our results show the feasibility of adapting language models pre-trained on modern languages to historical and ancient languages via adapter training.
