Learning of Generalizable and Interpretable Knowledge in Grid-Based Reinforcement Learning Environments

Abstract

Understanding the interactions of agents trained with deep reinforcement learning is crucial for deploying agents in games or the real world. In the former, unreasonable actions confuse players. In the latter, that effect is even more significant, as unexpected behavior can cause accidents with potentially grave and long-lasting consequences for the involved individuals. In this work, we propose using program synthesis to imitate reinforcement learning policies after seeing a trajectory of the action sequence. Programs have the advantage that they are inherently interpretable and verifiable for correctness. We adapt the state-of-the-art program synthesis system DreamCoder for learning concepts in grid-based environments, specifically, a navigation task and two miniature versions of Atari games, Space Invaders and Asterix. By inspecting the generated libraries, we can make inferences about the concepts the black-box agent has learned and better understand the agent's behavior. We achieve the same by visualizing the agent's decision-making process for the imitated sequences. We evaluate our approach with different types of program synthesizers based on a search-only method, a neural-guided search, and a language model fine-tuned on code.
