mCSQA: Multilingual Commonsense Reasoning Dataset with Unified Creation Strategy by Language Models and Humans

Abstract

Curating a dataset of language-specific knowledge and common sense for evaluating the natural language understanding capabilities of language models is very challenging. Because annotators are in limited supply, most current multilingual datasets are created through translation, which cannot evaluate such language-specific aspects. Therefore, we propose Multilingual CommonsenseQA (mCSQA), which follows the construction process of CSQA but leverages language models (LMs) for more efficient construction, e.g., by asking LMs to generate questions and answers, refine answers, and verify QA pairs, followed by reduced human effort for verification. The constructed dataset serves as a benchmark for the cross-lingual language-transfer capabilities of multilingual LMs. Experimental results showed high language-transfer capabilities for questions that LMs could easily solve, but lower transfer capabilities for questions requiring deep knowledge or commonsense. This highlights the necessity of language-specific datasets for evaluation and training. Finally, our method demonstrated that multilingual LMs could create QA pairs that include language-specific knowledge, significantly reducing the dataset creation cost compared to manual creation. The datasets are available at https://huggingface.co/datasets/yusuke1997/mCSQA.
