Native vs Non-Native Language Prompting: A Comparative Analysis

Abstract

Large language models (LLMs) have shown remarkable abilities in different fields, including standard Natural Language Processing (NLP) tasks. Prompts, which consist of natural language instructions, play a key role in eliciting knowledge from LLMs. Most open and closed source LLMs are trained on available labeled and unlabeled resources: digital content such as text, images, audio, and videos. Hence, these models have better knowledge of high-resourced languages but struggle with low-resourced languages. Since prompts play a crucial role in understanding model capabilities, the language used for prompts remains an important research question. Although this question has received research attention, existing work is still limited, and less has been explored for medium- to low-resourced languages. In this study, we investigate different prompting strategies (native vs. non-native) on 11 different NLP tasks associated with 12 different Arabic datasets (9.7K data points). In total, we conducted 197 experiments involving 3 LLMs, 12 datasets, and 3 prompting strategies. Our findings suggest that, on average, the non-native prompt performs the best, followed by mixed and native prompts.
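To make the three prompting strategies concrete, the following is a minimal sketch of how native, non-native, and mixed prompts might be assembled for a single Arabic sentiment example. The abstract does not define the strategies in detail, so everything here is an assumption made for illustration: the function `build_prompt`, the templates, the label sets, and in particular the reading of "mixed" as an English instruction paired with Arabic labels are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: templates, labels, and the definition of
# "mixed" are assumptions, not the prompts used in the study.

LABELS_EN = ["positive", "negative", "neutral"]
LABELS_AR = ["إيجابي", "سلبي", "محايد"]

INSTRUCTION_EN = "Classify the sentiment of the following text as one of"
INSTRUCTION_AR = "صنّف مشاعر النص التالي إلى إحدى الفئات"


def build_prompt(text: str, strategy: str) -> str:
    """Build a prompt for one input text under a given strategy.

    native:     instruction and labels in Arabic
    non_native: instruction and labels in English
    mixed:      English instruction, Arabic labels (assumed definition)
    """
    if strategy == "native":
        instruction, labels = INSTRUCTION_AR, LABELS_AR
    elif strategy == "non_native":
        instruction, labels = INSTRUCTION_EN, LABELS_EN
    elif strategy == "mixed":
        instruction, labels = INSTRUCTION_EN, LABELS_AR
    else:
        raise ValueError(f"unknown strategy: {strategy!r}")
    return f"{instruction}: {', '.join(labels)}\n\n{text}"


if __name__ == "__main__":
    sample = "الخدمة كانت ممتازة"  # hypothetical input text
    for name in ("native", "non_native", "mixed"):
        print(f"--- {name} ---")
        print(build_prompt(sample, name))
```

Under these assumptions, the input text stays in Arabic in all three cases; only the language of the instruction and of the expected output labels changes.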
