Abstract
Gender-neutral language reflects societal and linguistic shifts towards greater inclusivity by avoiding the implication that one gender is the norm over others. This is particularly relevant for grammatical gender languages, which heavily encode the gender of terms for human referents and over-rely on masculine forms, even when gender is unspecified or irrelevant. Language technologies are known to mirror these inequalities, being affected by a male bias and perpetuating stereotypical associations when translating into languages with extensive gendered morphology. In such cases, gender-neutral language can help avoid undue binary assumptions. However, despite its importance for creating fairer multi- and cross-lingual technologies, inclusive language research remains scarce and insufficiently supported in current resources. To address this gap, we present the multilingual mGeNTE dataset. Derived from the bilingual GeNTE (Piergentili et al., 2023), mGeNTE extends the original corpus to include the English-Italian/German/Spanish language pairs. Since each language pair is English-aligned with gendered and neutral sentences in the target languages, mGeNTE enables research in both automatic Gender-Neutral Translation (GNT) and language modelling for three grammatical gender languages.