Task-Relevant Adversarial Imitation Learning

Abstract

We show that a critical problem in adversarial imitation from high-dimensional sensory data is the tendency of discriminator networks to distinguish agent and expert behaviour using task-irrelevant features beyond the control of the agent. We analyze this problem in detail and propose a solution as well as several baselines that outperform standard Generative Adversarial Imitation Learning (GAIL). Our proposed solution, Task-Relevant Adversarial Imitation Learning (TRAIL), uses a constrained optimization objective to overcome task-irrelevant features. Comprehensive experiments show that TRAIL can solve challenging manipulation tasks from pixels by imitating human operators, where other agents such as behaviour cloning (BC), standard GAIL, improved GAIL variants including our newly proposed baselines, and Deterministic Policy Gradients from Demonstrations (DPGfD) fail to find solutions, even when the other agents have access to task reward.
