Abstract
Recent research on reinforcement learning in pure-conflict and pure-common interest games has emphasized the importance of population heterogeneity. In contrast, studies of reinforcement learning in mixed-motive games have primarily leveraged homogeneous approaches. Given the defining characteristic of mixed-motive games--the imperfect correlation of incentives between group members--we study the effect of population heterogeneity on mixed-motive reinforcement learning. We draw on interdependence theory from social psychology and imbue reinforcement learning agents with Social Value Orientation (SVO), a flexible formalization of preferences over group outcome distributions. We subsequently explore the effects of diversity in SVO on populations of reinforcement learning agents in two mixed-motive Markov games. We demonstrate that heterogeneity in SVO generates meaningful and complex behavioral variation among agents similar to that suggested by interdependence theory. Empirical results in these mixed-motive dilemmas suggest agents trained in heterogeneous populations develop particularly generalized, high-performing policies relative to those trained in homogeneous populations.