Scaling laws for language encoding models in fMRI

Abstract

Representations from transformer-based unidirectional language models are known to be effective at predicting brain responses to natural language. However, most studies comparing language models to brains have used GPT-2 or similarly sized language models. Here we tested whether larger open-source models such as those from the OPT and LLaMA families are better at predicting brain responses recorded using fMRI. Mirroring scaling results from other contexts, we found that brain prediction performance scales log-linearly with model size from 125M to 30B parameter models, with ~15% increased encoding performance as measured by correlation with a held-out test set across 3 subjects. Similar log-linear behavior was observed when scaling the size of the fMRI training set. We also characterized scaling for acoustic encoding models that use HuBERT, WavLM, and Whisper, and we found comparable improvements with model size. A noise ceiling analysis of these large, high-performance encoding models showed that performance is nearing the theoretical maximum for brain areas such as the precuneus and higher auditory cortex. These results suggest that increasing scale in both models and data will yield incredibly effective models of language processing in the brain, enabling better scientific understanding as well as applications such as decoding.
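
To make the log-linear claim concrete, the following is a minimal sketch of how such a trend could be checked, assuming encoding performance is modeled as a straight line in the base-10 logarithm of the parameter count. The function name `fit_log_linear` and all numbers are hypothetical. Only the 125M and 30B endpoints come from the abstract; the intermediate sizes are illustrative, and the scores are placeholders generated from a known line, not results from the paper.

```python
import math


def fit_log_linear(param_counts, scores):
    """Ordinary least-squares fit of score = intercept + slope * log10(param_count).

    Hypothetical helper for illustration; not the authors' analysis code.
    """
    if len(param_counts) != len(scores) or len(param_counts) < 2:
        raise ValueError("need at least two (param_count, score) pairs")

    # Work in log10 space so a log-linear trend becomes a straight line.
    xs = [math.log10(n) for n in param_counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(scores) / len(scores)

    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("param_counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))

    slope = sxy / sxx  # change in score per tenfold increase in model size
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder inputs: sizes span the 125M to 30B range named in the abstract;
# scores are synthetic, built from an exact line so the fit can be checked.
demo_sizes = [125e6, 1.3e9, 13e9, 30e9]
demo_scores = [0.5 + 0.25 * math.log10(n) for n in demo_sizes]

slope, intercept = fit_log_linear(demo_sizes, demo_scores)
```

Under this assumed model, the fitted slope reads as the gain in encoding performance per tenfold increase in parameters. The same fit could be applied with the amount of fMRI training data on the x-axis.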
