Abstract
This paper presents Grammar Reinforcement Learning (GRL), a reinforcementlearning algorithm that uses Monte Carlo Tree Search (MCTS) and a transformerarchitecture that models a Pushdown Automaton (PDA) within a context-freegrammar (CFG) framework. Taking as use case the problem of efficiently countingpaths and cycles in graphs, a key challenge in network analysis, computerscience, biology, and social sciences, GRL discovers new matrix-based formulasfor path/cycle counting that improve computational efficiency by factors of twoto six w.r.t state-of-the-art approaches. Our contributions include: (i) aframework for generating gramformers that operate within a CFG, (ii) thedevelopment of GRL for optimizing formulas within grammatical structures, and(iii) the discovery of novel formulas for graph substructure counting, leadingto significant computational improvements.