Illuminating Generalization in Deep Reinforcement Learning through Procedural Level Generation

Abstract

Deep reinforcement learning (RL) has shown impressive results in a variety of domains, learning directly from high-dimensional sensory streams. However, when neural networks are trained in a fixed environment, such as a single level in a video game, they will usually overfit and fail to generalize to new levels. When RL models overfit, even slight modifications to the environment can result in poor agent performance. In this paper, we explore how procedurally generated levels during training increase generality. We show that for some games procedural level generation enables generalization to new levels within the same distribution. Additionally, it is possible to achieve better performance with less data by manipulating the difficulty of the levels in response to the performance of the agent. The generality of the learned behaviors is also evaluated on a set of human-designed levels. Our results show that the ability to generalize to human-designed levels highly depends on the design of the level generators. We apply dimensionality reduction and clustering techniques to visualize the generators' distributions of levels and analyze to what degree they can produce levels similar to those designed by a human.
