A Survey on Causal Reinforcement Learning

Abstract

While Reinforcement Learning (RL) has achieved tremendous success in sequential decision-making problems across many domains, it still faces key challenges of data inefficiency and a lack of interpretability. Recently, many researchers have leveraged insights from the causality literature, producing a growing body of work that combines the merits of causality with RL to address these challenges. It is therefore both necessary and significant to collate these Causal Reinforcement Learning (CRL) works, offer a review of CRL methods, and investigate what causality can contribute to RL. In particular, we divide existing CRL approaches into two categories according to whether their causality-based information is given in advance or not. We further analyze each category in terms of the formalization of different models, including the Markov Decision Process (MDP), the Partially Observable Markov Decision Process (POMDP), Multi-Armed Bandits (MAB), and the Dynamic Treatment Regime (DTR). Moreover, we summarize the evaluation metrics and open-source resources, discuss emerging applications, and outline promising prospects for the future development of CRL.
