Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation

Abstract

Deep reinforcement learning (RL) has shown impressive results in a variety of domains, learning directly from high-dimensional sensory streams. However, when neural networks are trained in a fixed environment, such as a single level in a video game, they will usually overfit and fail to generalize to new levels. When RL models overfit, even slight modifications to the environment can result in poor agent performance. This paper explores how procedurally generated levels during training can increase generality. We show that for some games, procedural level generation enables generalization to new levels within the same distribution. Additionally, it is possible to achieve better performance with less data by manipulating the difficulty of the levels in response to the performance of the agent. The generality of the learned behaviors is also evaluated on a set of human-designed levels. The results suggest that the ability to generalize to human-designed levels highly depends on the design of the level generators. We apply dimensionality reduction and clustering techniques to visualize the generators' distributions of levels and analyze to what degree they can produce levels similar to those designed by a human.
