Active Use of Latent Constituency Representation in both Humans and Large Language Models

Abstract

Understanding how sentences are internally represented in the human brain, as well as in large language models (LLMs) such as ChatGPT, is a major challenge for cognitive science. Classic linguistic theories propose that the brain represents a sentence by parsing it into hierarchically organized constituents. In contrast, LLMs do not explicitly parse linguistic constituents, and their latent representations remain poorly explained. Here, we demonstrate that humans and LLMs construct similar latent representations of hierarchical linguistic constituents by analyzing their behaviors during a novel one-shot learning task, in which they infer which words should be deleted from a sentence. Both humans and LLMs tend to delete a constituent rather than a nonconstituent word string. In contrast, a naive sequence processing model that has access to word properties and ordinal positions does not show this property. Based on the word deletion behaviors, we can reconstruct the latent constituency tree representation of a sentence for both humans and LLMs. These results demonstrate that a latent tree-structured constituency representation can emerge in both the human brain and LLMs.
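The distinction between deleting a constituent and deleting a nonconstituent word string can be illustrated with a minimal sketch. It is not the analysis used in this work: the nested-list tree encoding, the function names, and the toy sentence are all assumptions made for illustration. A deleted word string counts as a constituent when it exactly matches the span of some node in the tree.

```python
from typing import Set, Tuple, Union

# Assumed encoding: a leaf is a word (str); an internal node is a list of subtrees.
Tree = Union[str, list]
Span = Tuple[int, int]  # half-open word-index interval [start, end)


def constituent_spans(tree: Tree, start: int = 0) -> Tuple[Set[Span], int]:
    """Collect the word span of every node in the tree.

    Returns the set of spans and the index just past the last word covered.
    """
    if isinstance(tree, str):
        # A single word trivially forms a constituent.
        return {(start, start + 1)}, start + 1
    spans: Set[Span] = set()
    pos = start
    for child in tree:
        child_spans, pos = constituent_spans(child, pos)
        spans |= child_spans
    spans.add((start, pos))  # span covered by this internal node
    return spans, pos


def is_constituent(tree: Tree, start: int, end: int) -> bool:
    """True if words [start, end) are exactly the span of some tree node."""
    spans, _ = constituent_spans(tree)
    return (start, end) in spans


# Hypothetical toy sentence: "the dog chased the cat"
TOY_TREE = [["the", "dog"], ["chased", ["the", "cat"]]]

print(is_constituent(TOY_TREE, 3, 5))  # deleting "the cat"
print(is_constituent(TOY_TREE, 1, 3))  # deleting "dog chased"
```

In this toy tree, "the cat" is the span of a node, whereas "dog chased" crosses a constituent boundary and matches no node.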
