Speaking Multiple Languages Affects the Moral Bias of Language Models

Abstract

Pre-trained multilingual language models (PMLMs) are commonly used when dealing with data from multiple languages and cross-lingual transfer. However, PMLMs are trained on varying amounts of data for each language. In practice this means their performance is often much better on English than many other languages. We explore to what extent this also applies to moral norms. Do the models capture moral norms from English and impose them on other languages? Do the models exhibit random and thus potentially harmful beliefs in certain languages? Both these issues could negatively impact cross-lingual transfer and potentially lead to harmful outcomes. In this paper, we (1) apply the MoralDirection framework to multilingual models, comparing results in German, Czech, Arabic, Mandarin Chinese, and English, (2) analyse model behaviour on filtered parallel subtitles corpora, and (3) apply the models to a Moral Foundations Questionnaire, comparing with human responses from different countries. Our experiments demonstrate that, indeed, PMLMs encode differing moral biases, but these do not necessarily correspond to cultural differences or commonalities in human opinions.
