Towards Sample-Efficiency and Generalization of Transfer and Inverse Reinforcement Learning: A Comprehensive Literature Review

Abstract

Reinforcement learning (RL) is a sub-domain of machine learning mainly concerned with solving sequential decision-making problems, in which a learning agent interacts with the decision environment and improves its behavior through the reward it receives from the environment. This learning paradigm is, however, well known for being time-consuming because it requires collecting a large amount of data, so RL suffers from sample inefficiency and difficult generalization. Furthermore, constructing an explicit reward function that accounts for the trade-off between multiple desiderata of a decision problem is often a laborious task. These challenges have recently been addressed using transfer and inverse reinforcement learning (T-IRL). In this regard, this paper is devoted to a comprehensive review of realizing the sample efficiency and generalization of RL algorithms through T-IRL. Following a brief introduction to RL, the fundamental T-IRL methods are presented and the most recent advancements in each research field are extensively reviewed. Our findings indicate that a majority of recent research works have dealt with the aforementioned challenges by utilizing human-in-the-loop and sim-to-real strategies for the efficient transfer of knowledge from source domains to the target domain under the transfer learning scheme. Under the IRL structure, training schemes that require a low number of experience transitions and the extension of such frameworks to multi-agent and multi-intention problems have been the priority of researchers in recent years.
