Reward Learning using Structural Motifs in Inverse Reinforcement Learning

Abstract

The Inverse Reinforcement Learning (IRL) problem has seen rapid evolution in the past few years, with important applications in domains like robotics, cognition, and health. In this work, we explore the inefficacy of current IRL methods in learning an agent's reward function from expert trajectories depicting long-horizon, complex sequential tasks. We hypothesize that imbuing IRL models with structural motifs capturing underlying tasks can enable and enhance their performance. Subsequently, we propose a novel IRL method, SMIRL, that first learns the (approximate) structure of a task as a finite-state automaton (FSA), then uses the structural motif to solve the IRL problem. We test our model on both discrete grid world and high-dimensional continuous domain environments. We empirically show that our proposed approach successfully learns all four complex tasks, where two foundational IRL baselines fail. Our model also outperforms the baselines in sample efficiency on a simpler toy task. We further show promising test results in a modified continuous domain on tasks with compositional reward functions.
