Subword models struggle with word learning, but surprisal hides it

Abstract

We study word learning in subword and character language models with the psycholinguistic lexical decision task. While subword LMs struggle to discern words and non-words with high accuracy, character LMs solve this task easily and consistently. Only when supplied with further contexts do subword LMs perform similarly to character models. Additionally, when looking at word-level and syntactic learning trajectories, we find that both processes are separable in character LMs. Word learning happens before syntactic learning, whereas both occur simultaneously in subword LMs. This raises questions about the adequacy of subword LMs for modeling language acquisition and positions character LMs as a viable alternative to study processes below the syntactic level.
