Inverse Reinforcement Learning from Summary Data

  • 2018-06-17 06:46:22
  • Antti Kangasrääsiö, Samuel Kaski

Abstract

Inverse reinforcement learning (IRL) aims to explain observed strategic behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This assumption may not hold in many real-world modeling settings, where only partial or summarized observations are available. In general, we may assume that there is a summarizing function $\sigma$, which acts as a filter between us and the true state-action paths that constitute the demonstration. Some initial approaches to extending IRL to such situations have been presented, but with very specific assumptions about the structure of $\sigma$, such as that only certain state observations are missing. This paper instead focuses on the most general case of the problem, where no assumptions are made about the summarizing function, except that it can be evaluated. We demonstrate that inference is still possible. The paper presents exact and approximate inference algorithms that allow full posterior inference, which is particularly important for assessing parameter uncertainty in this challenging inference situation. Empirical scalability is demonstrated to reasonably sized problems, and practical applicability is demonstrated by estimating the posterior for a cognitive science RL model based on an observed user's task completion time only.
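
To make the setting concrete, the sketch below illustrates inference from summaries on a toy problem. It is not the algorithm from the paper: it assumes a hypothetical six-cell corridor MDP, a Boltzmann (soft value iteration) agent whose unknown parameter `theta` is the goal reward, a summarizing function `sigma` that returns only the completion time of a path, and plain rejection sampling as a stand-in for likelihood-free inference. All names and constants (`soft_policy`, `simulate_path`, `abc_posterior`, `N_STATES`, `GAMMA`, `MAX_STEPS`, the uniform prior) are illustrative assumptions.

```python
import math
import random

N_STATES = 6      # corridor cells 0..5; the last cell is the terminal goal (assumed)
GAMMA = 0.95      # discount factor (assumed)
MAX_STEPS = 50    # cap on episode length (assumed)


def q_pair(s, values, theta):
    """Q-values (left, right) in state s; each step costs 1, the goal pays theta."""
    goal = N_STATES - 1
    left, right = max(s - 1, 0), s + 1
    q_left = -1.0 + GAMMA * values[left]
    q_right = -1.0 + (theta if right == goal else 0.0) + GAMMA * values[right]
    return q_left, q_right


def soft_policy(theta, n_iter=200):
    """Return P(move right) for each non-goal state under soft value iteration."""
    goal = N_STATES - 1
    values = [0.0] * N_STATES  # values[goal] stays 0 because the goal is terminal
    for _ in range(n_iter):
        new_values = values[:]
        for s in range(goal):
            q_left, q_right = q_pair(s, values, theta)
            m = max(q_left, q_right)  # log-sum-exp, shifted for stability
            new_values[s] = m + math.log(math.exp(q_left - m) + math.exp(q_right - m))
        values = new_values
    p_right = []
    for s in range(goal):
        q_left, q_right = q_pair(s, values, theta)
        p_right.append(1.0 / (1.0 + math.exp(q_left - q_right)))
    return p_right


def simulate_path(p_right, rng):
    """Sample one state-action path from cell 0 until the goal or MAX_STEPS."""
    goal = N_STATES - 1
    s, path = 0, []
    while s != goal and len(path) < MAX_STEPS:
        a = 1 if rng.random() < p_right[s] else 0  # 1 = right, 0 = left
        path.append((s, a))
        s = s + 1 if a else max(s - 1, 0)
    return path


def sigma(path):
    """Summarizing function: only the completion time (path length) is observed."""
    return len(path)


def abc_posterior(observed, n_draws, eps, rng, prior=(0.0, 5.0)):
    """Rejection sampling: keep theta draws whose simulated mean summary is within eps."""
    obs_mean = sum(observed) / len(observed)
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(*prior)
        policy = soft_policy(theta)
        sims = [sigma(simulate_path(policy, rng)) for _ in observed]
        if abs(sum(sims) / len(sims) - obs_mean) <= eps:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    rng = random.Random(0)
    true_policy = soft_policy(3.0)  # hypothetical "true" parameter
    observed = [sigma(simulate_path(true_policy, rng)) for _ in range(20)]
    samples = abc_posterior(observed, n_draws=500, eps=1.0, rng=rng)
    print(len(samples), "accepted draws of theta")
```

The accepted draws approximate a posterior over `theta` even though no state-action path is ever observed, only `sigma` of each path. Rejection sampling is used here purely for brevity; the tolerance `eps` and the choice of the mean as the comparison statistic are further assumptions of this sketch.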

 
