Neural Logic Reinforcement Learning

Abstract

Deep reinforcement learning (DRL) has achieved significant breakthroughs in various tasks. However, most DRL algorithms generalise poorly: even minor modifications of the training environment can substantially degrade the performance of the learned policy. In addition, the use of deep neural networks makes the learned policies hard to interpret. To address these two challenges, we propose a novel algorithm named Neural Logic Reinforcement Learning (NLRL), which represents reinforcement learning policies in first-order logic. NLRL is based on policy gradient methods and differentiable inductive logic programming that have demonstrated significant advantages in terms of interpretability and generalisability in supervised tasks. Extensive experiments on cliff-walking and blocks manipulation tasks demonstrate that NLRL can induce interpretable policies that achieve near-optimal performance while generalising well to environments with different initial states and problem sizes.
