Predicting Human Psychometric Properties Using Computational Language Models

Abstract

Transformer-based language models (LMs) continue to achieve state-of-the-art performance on natural language processing (NLP) benchmarks, including tasks designed to mimic human-inspired "commonsense" competencies. To better understand the degree to which LMs can be said to have certain linguistic reasoning skills, researchers are beginning to adapt the tools and concepts from psychometrics. But to what extent can benefits flow in the other direction? In other words, can LMs be of use in predicting the psychometric properties of test items, when those items are given to human participants? If so, the benefit for psychometric practitioners is enormous, as it can reduce the need for multiple rounds of empirical testing. We gather responses from numerous human participants and LMs (transformer- and non-transformer-based) on a broad diagnostic test of linguistic competencies. We then calculate standard psychometric properties of the items in the diagnostic test, using the human responses and the LM responses separately. We then determine how well these two sets of predictions correlate. We find that transformer-based LMs predict the human psychometric data consistently well across most categories, suggesting that they can be used to gather human-like psychometric data without the need for extensive human trials.
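
As a rough illustration of the comparison described above, the following minimal sketch computes one classical item property, item difficulty (the proportion of respondents answering an item correctly), separately for a human response matrix and an LM response matrix, and then correlates the two. The abstract does not say which psychometric properties or which correlation measure are used, so the choice of item difficulty and Pearson correlation, the function names, and the toy data are all assumptions made for illustration only.

```python
from math import sqrt


def item_difficulty(responses):
    """Classical item difficulty: the proportion of correct answers per item.

    `responses` is a list of rows, one per respondent; each row holds a
    0 (incorrect) or 1 (correct) for every item.
    """
    n_respondents = len(responses)
    n_items = len(responses[0])
    return [
        sum(row[j] for row in responses) / n_respondents
        for j in range(n_items)
    ]


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical, not from the paper): 4 items,
# 4 human respondents and 3 LMs treated as respondents.
human_responses = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
]
lm_responses = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
]

human_difficulty = item_difficulty(human_responses)
lm_difficulty = item_difficulty(lm_responses)
r = pearson(human_difficulty, lm_difficulty)

print("human item difficulty:", human_difficulty)
print("LM item difficulty:   ", lm_difficulty)
print("correlation:", round(r, 3))
```

A high correlation on real data would indicate that the LM-derived item statistics track the human-derived ones; the toy matrices here only demonstrate the mechanics.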
