Active Learning for Risk-Sensitive Inverse Reinforcement Learning

Abstract

One typical assumption in inverse reinforcement learning (IRL) is that human experts act to optimize the expected utility of a stochastic cost with a fixed distribution. This assumption deviates from actual human behavior under ambiguity. Risk-sensitive inverse reinforcement learning (RS-IRL) bridges this gap by assuming that humans act according to a random cost with respect to a set of subjectively distorted distributions instead of a fixed one. This assumption provides additional flexibility to model humans' risk preferences, represented by a risk envelope, in safety-critical tasks. However, like other learning-from-demonstration techniques, RS-IRL can suffer from inefficient learning due to redundant demonstrations. Inspired by the concept of active learning, this research derives a probabilistic disturbance sampling scheme that enables an RS-IRL agent to query expert support that is likely to expose unrevealed boundaries of the expert's risk envelope. Experimental results confirm that our approach accelerates the convergence of RS-IRL algorithms with lower variance while still guaranteeing unbiased convergence.
