Offensive Language Detection: A Comparative Analysis

Abstract

Offensive behaviour has become pervasive in the Internet community. Individuals take advantage of the anonymity of the cyber world to engage in offensive communication that they might not consider in real life. Governments, online communities, companies and others are investing in preventing offensive content on social media. One of the most effective solutions for tackling this problem is to use computational techniques to identify offensive content and act on it. The current work focuses on detecting offensive language in English tweets. The dataset used for the experiments is obtained from SemEval-2019 Task 6 on Identifying and Categorizing Offensive Language in Social Media (OffensEval) and contains 14,460 annotated English tweets. This paper provides a comparative analysis and a Random Kitchen Sink (RKS) based approach for offensive language detection. We explore the effectiveness of the Google sentence encoder, fastText, Dynamic Mode Decomposition (DMD) based features, and the RKS method for offensive language detection. In our experiments and evaluation, RKS with fastText achieved competitive results. The evaluation measures used are accuracy, precision, recall, and F1-score.
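As a rough illustration of the RKS idea named in the abstract, the sketch below maps an input vector to random cosine features whose inner products approximate a Gaussian kernel. It is not the authors' implementation: the Gaussian-kernel variant, the function names (make_rks_map, rbf_kernel), the parameter values, and the random toy vectors standing in for sentence embeddings are all assumptions made for this example.

```python
import math
import random


def make_rks_map(dim, n_components=2000, gamma=0.05, seed=0):
    """Return a function mapping a dim-dimensional vector to RKS features.

    Assumption: Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2).
    """
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)  # std of the random projection weights
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_components)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    norm = math.sqrt(2.0 / n_components)

    def transform(x):
        # One cosine feature per random projection of x.
        return [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return transform


def rbf_kernel(x, y, gamma=0.05):
    """Exact Gaussian kernel, used only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


# Hypothetical stand-ins for two 10-dimensional sentence embeddings.
data_rng = random.Random(1)
x = [data_rng.gauss(0.0, 1.0) for _ in range(10)]
y = [data_rng.gauss(0.0, 1.0) for _ in range(10)]

rks = make_rks_map(dim=10)
zx, zy = rks(x), rks(y)

approx = sum(a * b for a, b in zip(zx, zy))  # inner product in RKS space
exact = rbf_kernel(x, y)                     # kernel value it approximates
print(f"approx={approx:.3f} exact={exact:.3f}")
```

In a pipeline of the kind the abstract describes, the transformed vectors would presumably be passed to a linear classifier, so that a linear model in the RKS space behaves approximately like a kernel method on the original embeddings.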
