Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling

Abstract

Advances in the field of inverse reinforcement learning (IRL) have led to sophisticated inference frameworks that relax the original modeling assumption of observing an agent behavior that reflects only a single intention. Instead of learning a global behavioral model, recent IRL methods divide the demonstration data into parts, to account for the fact that different trajectories may correspond to different intentions, e.g., because they were generated by different domain experts. In this work, we go one step further: using the intuitive concept of subgoals, we build upon the premise that even a single trajectory can be explained more efficiently locally within a certain context than globally, enabling a more compact representation of the observed behavior. Based on this assumption, we build an implicit intentional model of the agent's goals to forecast its behavior in unobserved situations. The result is an integrated Bayesian prediction framework that significantly outperforms existing IRL solutions and provides smooth policy estimates consistent with the expert's plan. Most notably, our framework naturally handles situations where the intentions of the agent change over time and classical IRL algorithms fail. In addition, due to its probabilistic nature, the model can be straightforwardly applied in active learning scenarios to guide the demonstration process of the expert.
