Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees

Abstract

Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network's learned knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs.
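
To make the idea of mimicking a Q function with a tree of linear models concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a one-dimensional state, a hand-written `toy_q` function standing in for a trained Q network, and a single split. It fits each leaf by batch least squares, whereas the abstract describes an on-line learner. All function names are hypothetical.

```python
# Toy illustration only: a depth-1 tree with linear leaves fitted to a
# stand-in "Q function". Names are hypothetical and not taken from the paper.

def toy_q(state):
    """Assumed stand-in for a Q network's output for one fixed action."""
    return 2.0 * state + 1.0 if state < 0.5 else -1.0 * state + 4.0

def fit_line(xs, ys):
    """Ordinary least squares for y = w * x + b in one dimension."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0.0:
        return 0.0, mean_y  # degenerate leaf: fall back to a constant
    w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    return w, mean_y - w * mean_x

def sse(xs, ys, model):
    """Sum of squared errors of a (w, b) linear model on the given points."""
    w, b = model
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys))

def fit_stump(xs, ys, min_leaf=2):
    """Try every split between sorted points; keep the one with lowest SSE."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(min_leaf, len(pairs) - min_leaf + 1):
        left, right = pairs[:i], pairs[i:]
        lx, ly = zip(*left)
        rx, ry = zip(*right)
        left_model, right_model = fit_line(lx, ly), fit_line(rx, ry)
        err = sse(lx, ly, left_model) + sse(rx, ry, right_model)
        if best is None or err < best[0]:
            threshold = (left[-1][0] + right[0][0]) / 2.0
            best = (err, threshold, left_model, right_model)
    return best

def predict(stump, state):
    """Route the state to a leaf, then apply that leaf's linear model."""
    _, threshold, left_model, right_model = stump
    w, b = left_model if state < threshold else right_model
    return w * state + b

# "Observed" states and the Q values the mimic learner tries to reproduce.
states = [i / 20.0 for i in range(21)]
targets = [toy_q(s) for s in states]

single_error = sse(states, targets, fit_line(states, targets))
stump = fit_stump(states, targets)
```

In this toy setting, comparing `stump[0]` with `single_error` shows how a single split lets two linear leaves track a target that one global linear model cannot. The split threshold and the per-leaf weights are the kind of readable structure that the abstract refers to as transparent.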
