Language Specific Knowledge: Do Models Know Better in X than in English?

Abstract

Code-switching is the common phenomenon of alternating between different languages in the same utterance, thought, or conversation. We posit that humans code-switch because they feel more comfortable talking about certain topics and domains in one language than in another. With the rise of knowledge-intensive language models, we ask the natural next question: could models hold more knowledge on some topics in some language X? More importantly, could we improve reasoning by changing the language in which the reasoning is performed? We coin the term Language Specific Knowledge (LSK) to represent this phenomenon. As ethnic cultures tend to develop alongside different languages, we employ culture-specific datasets that contain knowledge about cultural and social behavioral norms. We find that language models can perform better when using chain-of-thought reasoning in some languages other than English, sometimes even better in low-resource languages. Paired with previous works showing that semantic similarity does not equate to representational similarity, we hypothesize that culturally specific texts occur more abundantly in their corresponding languages, enabling specific knowledge to occur only in specific "expert" languages. Motivated by our initial results, we design a simple methodology called LSKExtractor to benchmark the language-specific knowledge present in a language model and then exploit it during inference. We report results on various models and datasets, showing an average relative improvement of 10% in accuracy. Our research contributes to the open-source development of language models that are inclusive and more aligned with the cultural and linguistic contexts in which they are deployed.
