Abstract
This paper presents a novel approach combining inductive logic programming with reinforcement learning to improve training performance and explainability. We exploit inductive learning of answer set programs from noisy examples to learn a set of logical rules representing an explainable approximation of the agent policy at each batch of experience. We then perform answer set reasoning on the learned rules to guide the exploration of the learning agent at the next batch, without requiring inefficient reward shaping and preserving optimality with soft bias. The entire procedure is conducted during the online execution of the reinforcement learning algorithm. We preliminarily validate the efficacy of our approach by integrating it into the Q-learning algorithm for the Pac-Man scenario in two maps of increasing complexity. Our methodology produces a significant boost in the discounted return achieved by the agent, even in the first batches of training. Moreover, inductive learning does not compromise the computational time required by Q-learning and learned rules quickly converge to an explanation of the agent policy.
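As an illustration of what soft-biased exploration could look like in tabular Q-learning, the following minimal Python sketch may help. It is not the implementation evaluated in this paper. The function names, the `bias` parameter, and the `suggested` set (standing in for the actions returned by answer set reasoning over the learned rules) are all assumptions introduced for this example. Every action keeps a non-zero sampling probability, which is the sense in which the bias is soft.

```python
import random
from collections import defaultdict


def biased_explore(actions, suggested, bias=0.8, rng=random):
    """Sample an exploratory action, favouring rule-suggested actions.

    `suggested` is a hypothetical stand-in for the output of answer set
    reasoning. With 0 < bias < 1 every action keeps non-zero probability.
    """
    if not 0.0 < bias < 1.0:
        raise ValueError("bias must lie strictly between 0 and 1")
    weights = [bias if a in suggested else 1.0 - bias for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]


def select_action(q, state, actions, suggested, epsilon, bias=0.8, rng=random):
    """Epsilon-greedy selection whose exploration branch is softly biased."""
    if rng.random() < epsilon:
        return biased_explore(actions, suggested, bias, rng)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; rewards are left unshaped."""
    best_next = max(q[(next_state, a)] for a in actions)
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error


# Hypothetical usage: a Q-table defaulting to zero and a two-action state.
q_table = defaultdict(float)
q_update(q_table, "s0", "left", 1.0, "s1", ["left", "right"])
```

In this sketch the rules influence only which action is tried during exploration; the update itself is unchanged, so no reward shaping is involved.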