Recurrent World Models Facilitate Policy Evolution

Abstract

A generative recurrent neural network is quickly trained in an unsupervisedmanner to model popular reinforcement learning environments through compressedspatio-temporal representations. The world model's extracted features are fedinto compact and simple policies trained by evolution, achieving state of theart results in various environments. We also train our agent entirely inside ofan environment generated by its own internal world model, and transfer thispolicy back into the actual environment. Interactive version of paper athttps://worldmodels.github.io

Quick Read (beta)

loading the full paper ...