Cross-lingual hate speech detection based on multilingual domain-specific word embeddings

Abstract

Automatic hate speech detection in online social networks is an important open problem in Natural Language Processing (NLP). Hate speech is a multidimensional issue, strongly dependent on language and cultural factors. Despite its relevance, research on this topic has been almost exclusively devoted to English. Most supervised learning resources, such as labeled datasets and NLP tools, have been created for this same language. Considering that a large portion of users worldwide speak languages other than English, there is an important need for efficient approaches to multilingual hate speech detection. In this work we propose to address the problem of multilingual hate speech detection from the perspective of transfer learning. Our goal is to determine whether knowledge from one particular language can be used to classify another language, and to determine effective ways to achieve this. We propose a hate-specific data representation and evaluate its effectiveness against general-purpose universal representations, most of which, unlike our proposed model, have been trained on massive amounts of data. We focus on a cross-lingual setting, in which one needs to classify hate speech in one language without having access to any labeled data for that language. We show that the use of our simple yet specific multilingual hate representations improves classification results. We explain this with a qualitative analysis showing that our specific representation is able to capture some common patterns in how hate speech presents itself in different languages. Our proposal constitutes, to the best of our knowledge, the first attempt at constructing multilingual task-specific representations. Despite its simplicity, our model outperformed the previous approaches in most of the experimental setups. Our findings can orient future solutions toward the use of domain-specific representations.
