A Deep Reinforcement Learning based Approach to Learning Transferable Proof Guidance Strategies

Abstract

Traditional first-order logic (FOL) reasoning systems usually rely on manualheuristics for proof guidance. We propose TRAIL: a system that learns toperform proof guidance using reinforcement learning. A key design principle ofour system is that it is general enough to allow transfer to problems indifferent domains that do not share the same vocabulary of the training set. Todo so, we developed a novel representation of the internal state of a prover interms of clauses and inference actions. We also propose a novel neural-basedattention mechanism to learn interactions between clauses. We demonstrate thatthis approach enables the system to generalize from training to test dataacross domains with different vocabularies, suggesting that the neuralarchitecture in TRAIL is well suited for representing and processing of logicalformalisms. We also show that TRAIL's learned strategies provide a comparableperformance to an established heuristics-based theorem prover.

Quick Read (beta)

loading the full paper ...