Meta-Adversarial Inverse Reinforcement Learning for Decision-making Tasks

Abstract

Learning from demonstrations has made great progress over the past few years. However, it is generally data hungry and task specific. In other words, it requires a large amount of data to train a decent model on a particular task, and the model often fails to generalize to new tasks that have a different distribution. In practice, demonstrations from new tasks will be continuously observed, and the data might be unlabeled or only partially labeled. Therefore, it is desirable for the trained model to adapt to new tasks for which only limited data samples are available. In this work, we build an adaptable imitation learning model based on the integration of Meta-learning and Adversarial Inverse Reinforcement Learning (Meta-AIRL). We exploit the adversarial learning and inverse reinforcement learning mechanisms to learn policies and reward functions simultaneously from available training tasks and then adapt them to new tasks with the meta-learning framework. Simulation results show that the adapted policy trained with Meta-AIRL can effectively learn from a limited number of demonstrations and quickly reach performance comparable to that of the experts on unseen tasks.
