The Effect of Surprisal on Reading Times in Information Seeking and Repeated Reading

Abstract

The effect of surprisal on processing difficulty has been a central topic of investigation in psycholinguistics. Here, we use eyetracking data to examine three language processing regimes that are common in daily life but have not been addressed with respect to this question: information seeking, repeated processing, and the combination of the two. Using standard regime-agnostic surprisal estimates, we find that the prediction of surprisal theory regarding the presence of a linear effect of surprisal on processing times extends to these regimes. However, when using surprisal estimates from regime-specific contexts that match the contexts and tasks given to humans, we find that in information seeking, such estimates do not improve the predictive power for processing times compared to standard surprisals. Further, regime-specific contexts yield near-zero surprisal estimates with no predictive power for processing times in repeated reading. These findings point to misalignments of task and memory representations between humans and current language models, and question the extent to which such models can be used for estimating cognitively relevant quantities. We further discuss theoretical challenges posed by these results.
