Unraveling the Dominance of Large Language Models Over Transformer Models for Bangla Natural Language Inference: A Comprehensive Study

Abstract

Natural Language Inference (NLI) is a cornerstone of Natural Language Processing (NLP), providing insights into the entailment relationships between text pairings. It is a critical component of Natural Language Understanding (NLU), demonstrating the ability to extract information from spoken or written interactions. NLI is mainly concerned with determining the entailment relationship between two statements, known as the premise and hypothesis. When the premise logically implies the hypothesis, the pair is labeled "entailment". If the hypothesis contradicts the premise, the pair receives the "contradiction" label. When there is insufficient evidence to establish a connection, the pair is described as "neutral". Despite the success of Large Language Models (LLMs) in various tasks, their effectiveness in NLI remains constrained by issues like low-resource domain accuracy, model overconfidence, and difficulty in capturing human judgment disagreements. This study addresses the underexplored area of evaluating LLMs in low-resourced languages such as Bengali. Through a comprehensive evaluation, we assess the performance of prominent LLMs and state-of-the-art (SOTA) models in Bengali NLP tasks, focusing on natural language inference. Utilizing the XNLI dataset, we conduct zero-shot and few-shot evaluations, comparing LLMs like GPT-3.5 Turbo and Gemini 1.5 Pro with models such as BanglaBERT, Bangla BERT Base, DistilBERT, mBERT, and sahajBERT. Our findings reveal that while LLMs can achieve comparable or superior performance to fine-tuned SOTA models in few-shot scenarios, further research is necessary to enhance our understanding of LLMs in languages with modest resources like Bengali. This study underscores the importance of continued efforts in exploring LLM capabilities across diverse linguistic contexts.
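The three-way labeling and the zero-shot versus few-shot setup described in the abstract can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the prompt wording, the helper names `build_prompt`, `parse_label`, and `accuracy`, and the reply-parsing rule are hypothetical and are not taken from the paper. No model is called; the sketch only shows how a premise and hypothesis pair might be turned into a prompt and how a free-text reply might be scored.

```python
from typing import Optional, Sequence, Tuple

# The three NLI labels named in the abstract.
LABELS = ("entailment", "contradiction", "neutral")

# A labeled demonstration: (premise, hypothesis, label).
Example = Tuple[str, str, str]


def build_prompt(premise: str, hypothesis: str,
                 shots: Sequence[Example] = ()) -> str:
    """Build an NLI prompt. With no shots it is zero-shot; with shots, few-shot.

    The instruction wording is a hypothetical placeholder, not the paper's prompt.
    """
    lines = [
        "Decide how the hypothesis relates to the given text.",
        "Answer with one word: entailment, contradiction, or neutral.",
        "",
    ]
    # Few-shot: prepend solved demonstrations in the same layout as the query.
    for shot_premise, shot_hypothesis, shot_label in shots:
        lines += [
            f"Premise: {shot_premise}",
            f"Hypothesis: {shot_hypothesis}",
            f"Answer: {shot_label}",
            "",
        ]
    lines += [f"Premise: {premise}", f"Hypothesis: {hypothesis}", "Answer:"]
    return "\n".join(lines)


def parse_label(reply: str) -> Optional[str]:
    """Return the label that appears first in a model reply, or None if absent."""
    text = reply.lower()
    hits = [(text.find(label), label) for label in LABELS if label in text]
    return min(hits)[1] if hits else None


def accuracy(predictions: Sequence[Optional[str]], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels exactly."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)
```

In this sketch an unparseable reply maps to `None` and therefore counts as an error, which is one possible convention; an evaluation could instead retry or discard such replies.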
