Susu Box or Piggy Bank: Assessing Cultural Commonsense Knowledge between Ghana and the U.S.

  • 2024-10-23 02:56:15
  • Christabel Acquaye, Haozhe An, Rachel Rudinger
Recent work has highlighted the culturally-contingent nature of commonsense knowledge. We introduce AMAMMER${\epsilon}$, a test set of 525 multiple-choice questions designed to evaluate the commonsense knowledge of English LLMs, relative to the cultural contexts of Ghana and the United States. To create AMAMMER${\epsilon}$, we select a set of multiple-choice questions (MCQs) from existing commonsense datasets and rewrite them in a multi-stage process involving surveys of Ghanaian and U.S. participants. In three rounds of surveys, participants from both pools are solicited to (1) write correct and incorrect answer choices, (2) rate individual answer choices on a 5-point Likert scale, and (3) select the best answer choice from the newly-constructed MCQ items, in a final validation step. By engaging participants at multiple stages, our procedure ensures that participant perspectives are incorporated both in the creation and validation of test items, resulting in high levels of agreement within each pool. We evaluate several off-the-shelf English LLMs on AMAMMER${\epsilon}$. Uniformly, models prefer answer choices that align with the preferences of U.S. annotators over Ghanaian annotators. Additionally, when test items specify a cultural context (Ghana or the U.S.), models exhibit some ability to adapt, but performance is consistently better in U.S. contexts than Ghanaian. As large resources are devoted to the advancement of English LLMs, our findings underscore the need for culturally adaptable models and evaluations to meet the needs of diverse English-speaking populations around the world.


