Multilingual LLMs Inherently Reward In-Language Time-Sensitive Semantic Alignment for Low-Resource Languages

  • 2024-12-11 04:16:39
  • Ashutosh Bajpai, Tanmoy Chakraborty

Abstract

The unwavering disparity in labeled resources between resource-rich languages and those considered low-resource remains a significant impediment for Large Language Models (LLMs). Recent strides in cross-lingual in-context learning (X-ICL), mainly through semantically aligned examples retrieved from multilingual pre-trained transformers, have shown promise in mitigating this issue. However, our investigation reveals that LLMs intrinsically reward in-language semantically aligned cross-lingual instances over direct cross-lingual semantic alignments, with a pronounced disparity in handling time-sensitive queries in the X-ICL setup. Such queries demand sound temporal reasoning ability from LLMs, yet the advancements have predominantly focused on English. This study aims to bridge this gap by improving temporal reasoning capabilities in low-resource languages. To this end, we introduce mTEMPREASON, a temporal reasoning dataset aimed at the varied degrees of low-resource languages, and propose Cross-Lingual Time-Sensitive Semantic Alignment (CLiTSSA), a novel method to improve temporal reasoning in these contexts. To facilitate this, we construct an extension of mTEMPREASON comprising pairs of parallel cross-language temporal queries along with their anticipated in-language semantic similarity scores. Our empirical evidence underscores the superior performance of CLiTSSA compared to established baselines across three languages (Romanian, German, and French), encompassing three temporal tasks and including a diverse set of four contemporaneous LLMs. This marks a significant step forward in addressing resource disparity in the context of temporal reasoning across languages.

 
