Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning

Abstract

Reinforcement learning has seen great advancements in the past five years. The successful introduction of deep learning in place of more traditional methods allowed reinforcement learning to scale to very complex domains, achieving super-human performance in environments like the game of Go or numerous video games. Despite great successes in multiple domains, these new methods suffer from their own issues that often make them inapplicable to real-world problems. An extreme lack of data efficiency is, together with high variance and the difficulty of enforcing safety constraints, one of the three most prominent issues in the field. Usually, millions of data points sampled from the environment are necessary for these algorithms to converge to acceptable policies. This thesis proposes a novel Generative Adversarial Imaginative Reinforcement Learning algorithm. It takes advantage of the recent introduction of highly effective generative adversarial models, and of the Markov property that underpins the reinforcement learning setting, to model the dynamics of the real environment within an internal imagination module. Rollouts from the imagination are then used to artificially simulate the real environment in a standard reinforcement learning process, avoiding the often expensive and dangerous trial and error in the real environment. Experimental results show that the proposed algorithm utilises experience from the real environment more economically than the current state-of-the-art Rainbow DQN algorithm, and thus makes an important step towards sample efficient deep reinforcement learning.
