Enhancing Small Language Models for Cross-Lingual Generalized Zero-Shot Classification with Soft Prompt Tuning

Abstract

In NLP, Zero-Shot Classification (ZSC) has become essential for enabling models to classify text into categories unseen during training, particularly in low-resource languages and domains where labeled data is scarce. While pretrained language models (PLMs) have shown promise in ZSC, they often rely on large training datasets or external knowledge, limiting their applicability in multilingual and low-resource scenarios. Recent approaches leveraging natural language prompts reduce the dependence on large training datasets but struggle to effectively incorporate available labeled data from related classification tasks, especially when these datasets originate from different languages or distributions. Moreover, existing prompt-based methods typically rely on manually crafted prompts in a specific language, limiting their adaptability and effectiveness in cross-lingual settings. To address these challenges, we introduce RoSPrompt, a lightweight and data-efficient approach for training soft prompts that enhance cross-lingual ZSC while ensuring robust generalization across data distribution shifts. RoSPrompt is designed for small multilingual PLMs, enabling them to leverage high-resource languages to improve performance in low-resource settings without requiring extensive fine-tuning or high computational costs. We evaluate our approach on multiple multilingual PLMs across datasets covering 106 languages, demonstrating strong cross-lingual transfer performance and robust generalization capabilities over unseen classes.
