Abstract
The problem-solving in automated theorem proving (ATP) can be interpreted asa search problem where the prover constructs a proof tree step by step. In thispaper, we propose a deep reinforcement learning algorithm for proof search inintuitionistic propositional logic. The most significant challenge in theapplication of deep learning to the ATP is the absence of large, public theoremdatabase. We, however, overcame this issue by applying a novel dataaugmentation procedure at each iteration of the reinforcement learning. We alsoimprove the efficiency of the algorithm by representing the syntactic structureof formulas by a novel compact graph representation. Using the large volume ofaugmented data, we train highly accurate graph neural networks that approximatethe value function for the set of the syntactic structures of formulas. Ourmethod is also cost-efficient in terms of computational time. We will show thatour prover outperforms Coq's $\texttt{tauto}$ tactic, a prover based onhuman-engineered heuristics. Within the specified time limit, our prover solved84% of the theorems in a benchmark library, while $\texttt{tauto}$ was able tosolve only 52%.