Benchmarking Constraint Inference in Inverse Reinforcement Learning

Abstract

When deploying Reinforcement Learning (RL) agents into a physical system, we must ensure that these agents are well aware of the underlying constraints. In many real-world problems, however, the constraints followed by expert agents (e.g., humans) are often hard to specify mathematically and unknown to the RL agents. To tackle these issues, Constraint Inverse Reinforcement Learning (CIRL) considers the formalism of Constrained Markov Decision Processes (CMDPs) and estimates constraints from expert demonstrations by learning a constraint function. As an emerging research topic, CIRL does not have common benchmarks, and previous works tested their algorithms with hand-crafted environments (e.g., grid worlds). In this paper, we construct a CIRL benchmark in the context of two major application domains: robot control and autonomous driving. We design relevant constraints for each environment and empirically study the ability of different algorithms to recover those constraints based on expert trajectories that respect those constraints. To handle stochastic dynamics, we propose a variational approach that infers constraint distributions, and we demonstrate its performance by comparing it with other CIRL baselines on our benchmark. The benchmark, including the information for reproducing the performance of CIRL algorithms, is publicly available at https://github.com/Guiliang/CIRL-benchmarks-public
