Abstract
Despite major advances in NLP, significant disparities in NLP system performance across languages still exist. Arguably, these are due to uneven resource allocation and sub-optimal incentives to work on less-resourced languages. To track and further incentivize the global development of equitable language technology, we introduce GlobalBench. Prior multilingual benchmarks are static and have focused on a limited number of tasks and languages. In contrast, GlobalBench is an ever-expanding collection that aims to dynamically track progress on all NLP datasets in all languages. Rather than solely measuring accuracy, GlobalBench also tracks the estimated per-speaker utility and equity of technology across all languages, providing a multi-faceted view of how language technology is serving people of the world. Furthermore, GlobalBench is designed to identify the most under-served languages, and rewards research efforts directed towards those languages. At present, the most under-served languages, such as Punjabi, Portuguese, and Wu Chinese, are those with relatively large populations that are nonetheless overlooked by composite multilingual benchmarks. Currently, GlobalBench covers 966 datasets in 190 languages, and has 1,128 system submissions spanning 62 languages.