Finding Structure in Language Models

Abstract

When we speak, write or listen, we continuously make predictions based on our knowledge of a language's grammar. Remarkably, children acquire this grammatical knowledge within just a few years, enabling them to understand and generalise to novel constructions that have never been uttered before. Language models are powerful tools that create representations of language by incrementally predicting the next word in a sentence, and they have had a tremendous societal impact in recent years. The central research question of this thesis is whether these models possess a deep understanding of grammatical structure similar to that of humans. This question lies at the intersection of natural language processing, linguistics, and interpretability. To address it, we will develop novel interpretability techniques that enhance our understanding of the complex nature of large-scale language models. We approach our research question from three directions. First, we explore the presence of abstract linguistic information through structural priming, a key paradigm in psycholinguistics for uncovering grammatical structure in human language processing. Next, we examine various linguistic phenomena, such as adjective order and negative polarity items, and connect a model's comprehension of these phenomena to the data distribution on which it was trained. Finally, we introduce a controlled testbed for studying hierarchical structure in language models using various synthetic languages of increasing complexity and examine the role of feature interactions in modelling this structure. Our findings offer a detailed account of the grammatical knowledge embedded in language model representations and provide several directions for investigating fundamental linguistic questions using computational methods.
