Memory Augmented Large Language Models are Computationally Universal

Abstract

We show that transformer-based large language models are computationallyuniversal when augmented with an external memory. Any deterministic languagemodel that conditions on strings of bounded length is equivalent to a finiteautomaton, hence computationally limited. However, augmenting such models witha read-write memory creates the possibility of processing arbitrarily largeinputs and, potentially, simulating any algorithm. We establish that anexisting large language model, Flan-U-PaLM 540B, can be combined with anassociative read-write memory to exactly simulate the execution of a universalTuring machine, $U_{15,2}$. A key aspect of the finding is that it does notrequire any modification of the language model weights. Instead, theconstruction relies solely on designing a form of stored instruction computerthat can subsequently be programmed with a specific set of prompts.

Quick Read (beta)

loading the full paper ...