Towards a Mathematics Formalisation Assistant using Large Language Models

Abstract

Mathematics formalisation is the task of writing mathematics (i.e.,definitions, theorem statements, proofs) in natural language, as found in booksand papers, into a formal language that can then be checked for correctness bya program. It is a thriving activity today, however formalisation remainscumbersome. In this paper, we explore the abilities of a large language model(Codex) to help with formalisation in the Lean theorem prover. We find thatwith careful input-dependent prompt selection and postprocessing, Codex is ableto formalise short mathematical statements at undergrad level with nearly 75\%accuracy for $120$ theorem statements. For proofs quantitative analysis isinfeasible and we undertake a detailed case study. We choose a diverse set of$13$ theorems at undergrad level with proofs that fit in two-three paragraphs.We show that with a new prompting strategy Codex can formalise these proofs innatural language with at least one out of twelve Codex completion being easy torepair into a complete proof. This is surprising as essentially no aligned dataexists for formalised mathematics, particularly for proofs. These resultssuggest that large language models are a promising avenue towards fully orpartially automating formalisation.

Quick Read (beta)

loading the full paper ...