Abstract
Recent advances in Natural Language Processing (NLP) have led to the development of highly sophisticated language models for text generation. In parallel, neuroscience has increasingly employed these models to explore cognitive processes involved in language comprehension. Previous research has shown that models such as N-grams and LSTM networks can partially account for predictability effects in explaining eye movement behaviors, specifically Gaze Duration, during reading. In this study, we extend these findings by evaluating transformer-based models (GPT2, LLaMA-7B, and LLaMA2-7B) to further investigate this relationship. Our results indicate that these architectures outperform earlier models in explaining the variance in Gaze Durations recorded from Rioplatense Spanish readers. However, similar to previous studies, these models still fail to account for the entirety of the variance captured by human predictability. These findings suggest that, despite their advancements, state-of-the-art language models continue to predict language in ways that differ from human readers.