Garain at SemEval-2020 Task 12: Sequence based Deep Learning for Categorizing Offensive Language in Social Media

Abstract

SemEval-2020 Task 12 was OffensEval: Multilingual Offensive Language Identification in Social Media (Zampieri et al., 2020). The task covered multiple languages, with a dataset provided for each one. It was further divided into three sub-tasks: offensive language identification, automatic categorization of offense types, and offense target identification. I participated in sub-task C, offense target identification. To build the proposed system, I used deep learning networks such as LSTMs and frameworks such as Keras, combining the bag-of-words model with automatically generated sequence-based features and features manually extracted from the given dataset. Trained on 25% of the whole dataset, my system achieves a macro-averaged F1 score of 47.763%.
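For readers unfamiliar with the reported metric, the following is a minimal sketch of how a macro-averaged F1 score can be computed: the per-class F1 scores are averaged without weighting by class frequency, so rare target classes count as much as common ones. The function name `macro_f1`, the label strings, and the toy predictions are illustrative assumptions and are not taken from the paper or its data.

```python
from typing import Sequence


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the unweighted mean of the per-class F1 scores."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): three target classes, six posts.
gold = ["IND", "IND", "GRP", "GRP", "OTH", "IND"]
pred = ["IND", "GRP", "GRP", "GRP", "IND", "IND"]

score = macro_f1(gold, pred)
print(f"{score:.4f}")  # 0.4889
```

In this toy case the class that is never predicted correctly contributes an F1 of 0 and pulls the average down sharply, which is why macro-averaged scores on imbalanced target classes tend to be well below accuracy.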
