LLMs Are Few-Shot In-Context Low-Resource Language Learners

Abstract

In-context learning (ICL) empowers large language models (LLMs) to perform diverse tasks in underrepresented languages using only short in-context information, offering a crucial avenue for narrowing the gap between high-resource and low-resource languages. Nonetheless, only a handful of works have explored ICL for low-resource languages, and most of them focus on relatively high-resource languages such as French and Spanish. In this work, we extensively study ICL and its cross-lingual variation (X-ICL) on 25 low-resource and 7 relatively higher-resource languages. Our study not only assesses the effectiveness of ICL with LLMs in low-resource languages but also identifies the shortcomings of in-context label alignment and introduces a more effective alternative: query alignment. Moreover, we provide valuable insights into various facets of ICL for low-resource languages. Our study concludes that few-shot in-context information is significant for enhancing the low-resource understanding quality of LLMs: semantically relevant information closes the language gap in the target language and aligns the semantics between the targeted low-resource language and the high-resource language that the model is proficient in. Our work highlights the importance of advancing ICL research, particularly for low-resource languages.
