Abstract
Building safe Large Language Models (LLMs) across multiple languages is essential in ensuring both safe access and linguistic diversity. To this end, we introduce M-ALERT, a multilingual benchmark that evaluates the safety of LLMs in five languages: English, French, German, Italian, and Spanish. M-ALERT includes 15k high-quality prompts per language, totaling 75k, following the detailed ALERT taxonomy. Our extensive experiments on 10 state-of-the-art LLMs highlight the importance of language-specific safety analysis, revealing that models often exhibit significant inconsistencies in safety across languages and categories. For instance, Llama3.2 shows high unsafety in the category crime_tax for Italian but remains safe in other languages. Similar differences can be observed across all models. In contrast, certain categories, such as substance_cannabis and crime_propaganda, consistently trigger unsafe responses across models and languages. These findings underscore the need for robust multilingual safety practices in LLMs to ensure safe and responsible usage across diverse user communities.