Sentiment Analysis Across Multiple African Languages: A Current Benchmark

Abstract

Sentiment analysis is a fundamental and valuable task in NLP. However, due to limitations in data and technological availability, research into sentiment analysis of African languages has been fragmented and lacking. With the recent release of the AfriSenti-SemEval Shared Task 12, hosted as part of the 17th International Workshop on Semantic Evaluation, an annotated sentiment analysis dataset covering 14 African languages was made available. We benchmarked and compared current state-of-the-art transformer models across 12 languages and compared the performance of training one-model-per-language versus single-model-all-languages. We also evaluated the performance of standard multilingual models and their ability to learn and transfer cross-lingual representation from non-African to African languages. Our results show that, despite work in low-resource modeling, more data still produces better models on a per-language basis. Models explicitly developed for African languages outperform other models on all tasks. Additionally, among the models evaluated, no one-model-fits-all solution exists on a per-language basis. Moreover, for some languages with a smaller sample size, a larger multilingual model may perform better than a dedicated per-language model for sentiment classification.
