Hate-Alert@DravidianLangTech-EACL2021: Ensembling strategies for Transformer-based Offensive language Detection

Abstract

Social media often acts as a breeding ground for different forms of offensive content. For low-resource languages like Tamil, the situation is more complex due to the poor performance of multilingual or language-specific models and the lack of proper benchmark datasets. Based on the shared task Offensive Language Identification in Dravidian Languages at EACL 2021, we present an exhaustive exploration of different transformer models. We also provide a genetic algorithm technique for ensembling different models. Our ensembled models, trained separately for each language, secured the first position in Tamil, the second position in Kannada, and the first position in Malayalam sub-tasks. The models and code are provided.
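
As a rough illustration of what a genetic algorithm for ensembling can look like, the sketch below evolves one non-negative weight per model and scores each candidate by the accuracy of the weighted-average prediction on held-out labels. The abstract does not give the details of the technique, so everything here is an assumption made for illustration: the function names (`ensemble_predict`, `accuracy`, `evolve_weights`), the choice of accuracy as fitness, uniform crossover, Gaussian mutation, and the toy data. It should not be read as the procedure used in the paper.

```python
import random


def ensemble_predict(weights, model_probs):
    """Weighted sum of per-model class probabilities, then argmax per example.

    model_probs[m][i][c] is model m's probability of class c for example i.
    """
    n_examples = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    preds = []
    for i in range(n_examples):
        scores = [
            sum(w * probs[i][c] for w, probs in zip(weights, model_probs))
            for c in range(n_classes)
        ]
        preds.append(max(range(n_classes), key=scores.__getitem__))
    return preds


def accuracy(weights, model_probs, labels):
    """Fraction of examples the weighted ensemble labels correctly."""
    preds = ensemble_predict(weights, model_probs)
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def evolve_weights(model_probs, labels, pop_size=20, generations=30,
                   mutation_scale=0.1, seed=0):
    """Search for ensemble weights with a simple elitist genetic algorithm.

    All hyperparameters are illustrative defaults, not values from the paper.
    """
    rng = random.Random(seed)
    n_models = len(model_probs)

    def fitness(w):
        return accuracy(w, model_probs, labels)

    # Initial population: random weight vectors in [0, 1).
    population = [[rng.random() for _ in range(n_models)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half unchanged (elitism).
        parents = sorted(population, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each weight comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clipped so weights stay non-negative.
            child = [max(0.0, w + rng.gauss(0.0, mutation_scale)) for w in child]
            children.append(child)
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)


# Toy validation data (hypothetical): model 0 is confidently right,
# model 1 is mildly wrong on every example.
toy_labels = [0, 1, 1, 0]
toy_probs = [
    [[0.9, 0.1], [0.1, 0.9], [0.1, 0.9], [0.9, 0.1]],
    [[0.4, 0.6], [0.6, 0.4], [0.6, 0.4], [0.4, 0.6]],
]
best_weights, best_score = evolve_weights(toy_probs, toy_labels)
print(best_weights, best_score)
```

In practice the fitness would be computed on a validation split separate from the test data, and a metric better suited to imbalanced classes than accuracy may be preferable; both choices are left open here.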
