Abstract
Deep reinforcement learning (DRL) has demonstrated impressive performance in various gaming simulators and real-world applications. In practice, however, a DRL agent may receive faulty observations caused by abrupt interferences such as black-out, frozen-screen, and adversarial perturbation. Designing a DRL algorithm that is resilient to these rare but mission-critical and safety-crucial scenarios is an important yet challenging task. In this paper, we consider a generative DRL framework trained with an auxiliary task based on observational interferences, such as artificial noise. Under this framework, we discuss the importance of the causal relation and propose a causal-inference-based DRL algorithm called the causal inference Q-network (CIQ). We evaluate the performance of CIQ in several benchmark DRL environments, using different types of interferences as auxiliary labels. Our experimental results show that the proposed CIQ method can achieve higher performance and greater resilience against observational interferences.