MoET: Interpretable and Verifiable Reinforcement Learning via Mixture of Expert Trees

Abstract

Deep Reinforcement Learning (DRL) has led to many recent breakthroughs on complex control tasks, such as defeating the best human player in the game of Go. However, decisions made by the DRL agent are not explainable, hindering its applicability in safety-critical settings. Viper, a recently proposed technique, constructs a decision tree policy by mimicking the DRL agent. Decision trees are interpretable, as each action taken can be traced back to the decision rule path that led to it. However, one global decision tree approximating the DRL policy has significant limitations with respect to the geometry of decision boundaries. We propose MoET, a more expressive yet still interpretable model based on Mixture of Experts, consisting of a gating function that partitions the state space and multiple decision tree experts that specialize on different partitions. We propose a training procedure to support non-differentiable decision tree experts and integrate it into the imitation learning procedure of Viper. We evaluate our algorithm on four OpenAI gym environments and show that the policy constructed in this way is more performant and better mimics the DRL agent, lowering mispredictions and increasing the reward. We also show that MoET policies are amenable to verification using off-the-shelf automated theorem provers such as Z3.
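
To make the structure described above concrete, the following is a minimal sketch of how a policy built from a gating function and decision tree experts might select an action. It is an illustration under stated assumptions, not the authors' implementation: the linear softmax gate, the tuple encoding of trees, the hard (argmax) expert selection, and all names (`tree_predict`, `gate_probs`, `moet_hard_predict`) are hypothetical, and the training procedure is not shown.

```python
import math

# Assumed tree encoding (hypothetical): a leaf is an int action;
# an internal node is (feature_index, threshold, left_subtree, right_subtree).
def tree_predict(tree, state):
    """Follow decision rules from the root until a leaf action is reached."""
    while not isinstance(tree, int):
        feature, threshold, left, right = tree
        tree = left if state[feature] <= threshold else right
    return tree

def gate_probs(weights, biases, state):
    """Assumed linear softmax gate: one probability per expert."""
    logits = [sum(w * s for w, s in zip(row, state)) + b
              for row, b in zip(weights, biases)]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def moet_hard_predict(weights, biases, experts, state):
    """Pick the most probable expert, then return (expert_index, action)."""
    probs = gate_probs(weights, biases, state)
    k = max(range(len(probs)), key=probs.__getitem__)
    return k, tree_predict(experts[k], state)

# Toy two-dimensional state space with two experts.
# The gate splits on the sign of state[0]; each expert splits on state[1].
weights = [[1.0, 0.0], [-1.0, 0.0]]
biases = [0.0, 0.0]
experts = [
    (1, 0.5, 0, 1),   # expert 0: action 0 if state[1] <= 0.5, else action 1
    (1, -0.5, 1, 0),  # expert 1: action 1 if state[1] <= -0.5, else action 0
]

expert_index, action = moet_hard_predict(weights, biases, experts, [2.0, 1.0])
```

With hard selection, every action can be traced to one gate decision followed by one rule path in a single tree, which is the property that keeps such a policy readable.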
