Unsupervised Acquisition of Discrete Grammatical Categories

  • 2025-03-24 15:15:08
  • David Ph. Shakouri, Crit Cremers, Niels O. Schiller

Abstract

This article presents experiments performed using a computational laboratory environment for language acquisition experiments. It implements a multi-agent system consisting of two agents: an adult language model and a daughter language model that aims to learn the mother language. Crucially, the daughter agent does not have access to the internal knowledge of the mother language model but only to the language exemplars the mother agent generates. These experiments illustrate how this system can be used to acquire abstract grammatical knowledge. We demonstrate how statistical analyses of patterns in the input data corresponding to grammatical categories yield discrete grammatical rules. These rules are subsequently added to the grammatical knowledge of the daughter language model. To this end, hierarchical agglomerative cluster analysis was applied to the utterances consecutively generated by the mother language model. It is argued that this procedure can be used to acquire structures resembling grammatical categories proposed by linguists for natural languages. Thus, it is established that non-trivial grammatical knowledge has been acquired. Moreover, the parameter configuration of this computational laboratory environment determined using training data generated by the mother language model is validated in a second experiment with a test set similarly resulting in the acquisition of non-trivial categories.
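
To make the clustering step concrete, the following is a minimal sketch of hierarchical agglomerative clustering applied to word distributions in a toy set of utterances. It is not the authors' implementation: the abstract does not state which features, distance measure, linkage criterion, or stopping rule were used, so the neighbour-count features, cosine distance, average linkage, fixed cluster count, toy corpus, and all function names here are assumptions chosen for illustration.

```python
from collections import Counter
from itertools import combinations
import math


def context_vectors(utterances):
    """Map each word to counts of its immediate left and right neighbours.

    Assumption: adjacent-word contexts stand in for whatever distributional
    features the original system extracts from the mother's utterances.
    """
    vectors = {}
    for utterance in utterances:
        tokens = ["<s>"] + utterance.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            vec = vectors.setdefault(tokens[i], Counter())
            vec["L:" + tokens[i - 1]] += 1
            vec["R:" + tokens[i + 1]] += 1
    return vectors


def cosine_distance(a, b):
    """Return 1 minus the cosine similarity of two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def agglomerate(vectors, n_clusters):
    """Average-linkage agglomerative clustering down to n_clusters groups.

    Every word starts in its own cluster; the two closest clusters are
    merged repeatedly. Assumption: a fixed cluster count is the stopping rule.
    """
    clusters = [frozenset([word]) for word in sorted(vectors)]

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters.
        total = sum(cosine_distance(vectors[x], vectors[y])
                    for x in c1 for y in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


# Hypothetical exemplars standing in for the mother agent's output.
corpus = [
    "the dog sleeps", "the cat sleeps", "a dog runs",
    "a cat runs", "the dog runs", "a cat sleeps",
]

categories = agglomerate(context_vectors(corpus), n_clusters=3)
for category in categories:
    print(sorted(category))
```

On this toy corpus, words that occur in the same positions share context features and merge first, so the resulting groups resemble determiner-like, noun-like, and verb-like categories. Each discrete cluster could then be read as a candidate category rule for the daughter model, which is the general idea the abstract describes; how the original system converts clusters into rules is not specified here.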

 
