Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding

Abstract

Large Language Models (LLMs) show promising learning and reasoning abilities. Compared to other NLP tasks, multilingual and multi-label emotion evaluation tasks are under-explored in LLMs. In this paper, we present EthioEmo, a multi-label emotion classification dataset for four Ethiopian languages, namely, Amharic (amh), Afan Oromo (orm), Somali (som), and Tigrinya (tir). We perform extensive experiments with an additional English multi-label emotion dataset from SemEval 2018 Task 1. Our evaluation includes encoder-only, encoder-decoder, and decoder-only language models. We compare zero-shot and few-shot approaches of LLMs to fine-tuning smaller language models. The results show that the accuracy of multi-label emotion classification is still insufficient even for high-resource languages such as English, and there is a large gap between the performance of high-resource and low-resource languages. The results also show varying performance levels depending on the language and model type. EthioEmo is available publicly to further improve the understanding of emotions in language models and how people convey emotions through various languages.
