Language Design and Renormalization

Abstract

Here we consider some well-known facts in syntax from a physics perspective, which allows us to establish some remarkable equivalences. Specifically, we observe that the operation MERGE, put forward by N. Chomsky in 1995, can be interpreted as a physical information coarse-graining. Thus, MERGE in linguistics entails information renormalization in physics, according to different time scales. We make this point mathematically formal in terms of language models, i.e., probability distributions over word sequences, widely used in natural language processing and in other fields. In this setting, MERGE corresponds to a 3-index probability tensor implementing a coarse-graining, akin to a probabilistic context-free grammar. The probability vectors of meaningful sentences are naturally given by stochastic tensor networks (TN) that are mostly loop-free, such as Tree Tensor Networks and Matrix Product States. These structures have short-ranged correlations in the syntactic distance by construction and, because of the peculiarities of human language, they are extremely efficient to manipulate computationally. We also propose how to obtain such language models from probability distributions of certain TN quantum states, which we show to be efficiently preparable by a quantum computer. Moreover, using tools from entanglement theory, we use these quantum states to prove classical lower bounds on the perplexity of the probability distribution for a set of words in a sentence. Implications of these results are discussed for theoretical and computational linguistics, artificial intelligence, programming languages, RNA and protein sequencing, quantum many-body systems, and beyond. Our work shows how many of the key linguistic ideas from the last century, including developments in computational linguistics, fit perfectly with known physical concepts linked to renormalization.
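
As an illustration of the 3-index probability tensor mentioned above, the following is a minimal sketch rather than the construction used in the paper. It assumes a hypothetical toy alphabet of three symbols shared by words and phrases, a randomly generated tensor `M[a, b, c]` read as the probability that symbol `a` splits into the ordered pair `(b, c)`, and a fixed two-level binary tree with four leaves. Contracting the tree yields a normalized probability distribution over four-word sequences, which is the sense in which a loop-free stochastic tensor network defines a language model.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 3  # size of the toy symbol alphabet (an assumption for illustration)

# M[a, b, c]: assumed probability that symbol a splits into the pair (b, c).
# Normalizing over (b, c) for each a makes every slice a probability table.
M = rng.random((d, d, d))
M /= M.sum(axis=(1, 2), keepdims=True)

# The root symbol is fixed to symbol 0 with probability 1.
root = np.zeros(d)
root[0] = 1.0

# Two-level tree: root a -> (l, r), l -> (w1, w2), r -> (w3, w4).
# p[w1, w2, w3, w4] = sum_{a,l,r} root[a] M[a,l,r] M[l,w1,w2] M[r,w3,w4]
p = np.einsum("a,alr,lij,rkm->ijkm", root, M, M, M)

# Summing out the leaves of one branch coarse-grains it back to its parent,
# leaving the distribution over the other branch's two words.
p_right = p.sum(axis=(0, 1))

print(p.shape, float(p.sum()), float(p_right.sum()))
```

Because each slice of `M` is normalized, summing over a pair of leaves removes that tensor from the contraction, so the total probability stays equal to one at every level of the tree.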
