Abstract
Generalization is a major challenge for multi-agent reinforcement learning. How well does an agent perform when placed in novel environments and in interactions with new co-players? In this paper, we investigate and quantify the relationship between generalization and diversity in the multi-agent domain. Across the range of multi-agent environments considered here, procedurally generating training levels significantly improves agent performance on held-out levels. However, agent performance on the specific levels used in training sometimes declines as a result. To better understand the effects of co-player variation, our experiments introduce a new environment-agnostic measure of behavioral diversity. Results demonstrate that population size and intrinsic motivation are both effective methods of generating greater population diversity. In turn, training with a diverse set of co-players strengthens agent performance in some (but not all) cases.