fMRI predictors based on language models of increasing complexity recover brain left lateralization

Abstract

Over the past decade, studies of naturalistic language processing where participants are scanned while listening to continuous text have flourished. Using word embeddings at first, then large language models, researchers have created encoding models to analyze the brain signals. Presenting these models with the same text as the participants makes it possible to identify brain areas where there is a significant correlation between the functional magnetic resonance imaging (fMRI) time series and the time series predicted by the models' artificial neurons. One intriguing finding from these studies is that they have revealed highly symmetric bilateral activation patterns, somewhat at odds with the well-known left lateralization of language processing. Here, we report analyses of an fMRI dataset where we manipulate the complexity of large language models, testing 28 pretrained models from 8 different families, ranging from 124M to 14.2B parameters. First, we observe that the performance of models in predicting brain responses follows a scaling law, where the fit with brain activity increases linearly with the logarithm of the number of parameters of the model (and its performance on natural language processing tasks). Second, although this effect is present in both hemispheres, it is stronger in the left than in the right hemisphere. Specifically, the left-right difference in brain correlation follows a scaling law with the number of parameters. This finding reconciles computational analyses of brain activity using large language models with the classic observation from aphasic patients showing left hemisphere dominance for language.
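
As a rough illustration of the scaling relation described above, the following sketch fits a brain score as a linear function of the logarithm of the parameter count, separately for each hemisphere and for the left-right difference. The function name `fit_log_scaling`, the base-10 logarithm, and all numeric values are assumptions made for illustration; the scores are synthetic and are not results from the paper, whose actual fitting procedure is not specified in the abstract.

```python
import numpy as np


def fit_log_scaling(n_params, scores):
    """Least-squares fit of scores ~ intercept + slope * log10(n_params).

    Hypothetical helper for illustration; not the authors' code.
    """
    x = np.log10(np.asarray(n_params, dtype=float))
    y = np.asarray(scores, dtype=float)
    # np.polyfit returns coefficients from highest degree to lowest.
    slope, intercept = np.polyfit(x, y, deg=1)
    return slope, intercept


# Synthetic model sizes spanning the range quoted in the abstract.
n_params = np.array([124e6, 1.5e9, 7e9, 14.2e9])

# Synthetic brain scores (assumed values, NOT data from the paper):
# both hemispheres improve with log10(size), the left one faster.
left = 0.020 * np.log10(n_params) + 0.01
right = 0.015 * np.log10(n_params) + 0.01

slope_left, _ = fit_log_scaling(n_params, left)
slope_right, _ = fit_log_scaling(n_params, right)
# A positive slope here means lateralization grows with model size.
slope_diff, _ = fit_log_scaling(n_params, left - right)
```

Under these assumptions, a positive `slope_diff` corresponds to the abstract's second claim: the left-right difference itself increases with the logarithm of the number of parameters.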
