Abstract
In this paper, we discuss how pure mathematics and theoretical physics can be applied to the study of language models. Using set theory and analysis, we formulate mathematically rigorous definitions of language models and introduce the concept of the moduli space of distributions for a language model. We formulate a generalized distributional hypothesis using functional analysis and topology. We define the entropy function associated with a language model and show how it allows us to understand many interesting phenomena in languages. We argue that the zero points of the entropy function, and the points where the entropy is close to 0, are the key obstacles for an LLM to approximate an intelligent language model, which explains why good LLMs need billions of parameters. Using the entropy function, we formulate a conjecture about AGI. Then, we show how thermodynamics gives us an immediate interpretation of language models. In particular, we define the concepts of partition function, internal energy and free energy for a language model, which offer insights into how language models work. Based on these results, we introduce a general concept of the geometrization of language models and define what is called the Boltzmann manifold; current LLMs are special cases of the Boltzmann manifold.