Generating Explanations from Deep Reinforcement Learning Using Episodic Memory

Abstract

Deep Reinforcement Learning (RL) involves the use of Deep Neural Networks (DNNs) to make sequential decisions in order to maximize reward. For many tasks, the resulting sequence of actions produced by a Deep RL policy can be long and difficult for humans to understand. A crucial component of human explanations is selectivity, whereby only key decisions and causes are recounted. Imbuing Deep RL agents with such an ability would make their resulting policies easier to understand from a human perspective and generate a concise set of instructions to aid the learning of future agents. To this end, we use a Deep RL agent with an episodic memory system to identify and recount key decisions during policy execution. We show that these decisions form a short, human-readable explanation that can also be used to speed up the learning of naive Deep RL agents in an algorithm-independent manner.
