Dissociating language and thought in large language models: a cognitive perspective

Abstract

Today's large language models (LLMs) routinely generate coherent, grammatical and seemingly meaningful paragraphs of text. This achievement has led to speculation that these networks are -- or will soon become -- "thinking machines", capable of performing tasks that require abstract knowledge and reasoning. Here, we review the capabilities of LLMs by considering their performance on two different aspects of language use: 'formal linguistic competence', which includes knowledge of rules and patterns of a given language, and 'functional linguistic competence', a host of cognitive abilities required for language understanding and use in the real world. Drawing on evidence from cognitive neuroscience, we show that formal competence in humans relies on specialized language processing mechanisms, whereas functional competence recruits multiple extralinguistic capacities that comprise human thought, such as formal reasoning, world knowledge, situation modeling, and social cognition. In line with this distinction, LLMs show impressive (although imperfect) performance on tasks requiring formal linguistic competence, but fail on many tests requiring functional competence. Based on this evidence, we argue that (1) contemporary LLMs should be taken seriously as models of formal linguistic skills; (2) models that master real-life language use would need to incorporate or develop not only a core language module, but also multiple non-language-specific cognitive capacities required for modeling thought. Overall, a distinction between formal and functional linguistic competence helps clarify the discourse surrounding LLMs' potential and provides a path toward building models that understand and use language in human-like ways.
