The standard risk minimization paradigm of machine learning is brittle when operating in environments whose test distributions differ from the training distribution due to spurious correlations. Training on data from many environments and finding invariant predictors reduces the effect of spurious features by concentrating models on features that have a causal relationship with the outcome. In this work, we pose such invariant risk minimization as finding the Nash equilibrium of an ensemble game among several environments. By doing so, we develop a simple training algorithm that uses best response dynamics and, in our experiments, yields similar or better empirical accuracy with much lower variance than solving the challenging bi-level optimization problem of Arjovsky et al. (2019). One key theoretical contribution is showing that the set of Nash equilibria for the proposed game is equivalent to the set of invariant predictors for any finite number of environments, even with nonlinear classifiers and transformations. As a result, our method also retains the generalization guarantees to a large set of environments shown in Arjovsky et al. (2019). The proposed algorithm adds to the collection of successful game-theoretic machine learning algorithms such as generative adversarial networks.
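To make the idea of an ensemble game with best response dynamics concrete, the following is a minimal sketch and not the implementation used in this work. It assumes linear predictors, a squared-error risk, an ensemble formed by averaging the per-environment weights, and a single gradient step as an approximate best response. The names `env_risk` and `best_response_dynamics` and the toy data are hypothetical.

```python
import numpy as np


def env_risk(w_avg, X, y):
    """Mean squared error of the ensemble predictor on one environment."""
    return float(np.mean((X @ w_avg - y) ** 2))


def best_response_dynamics(envs, rounds=500, lr=0.1):
    """Round-robin updates in which each environment adjusts only its own weights.

    envs: list of (X, y) pairs, one per training environment (hypothetical data).
    Returns the list of per-environment weight vectors.
    """
    k = len(envs)
    d = envs[0][0].shape[1]
    weights = [np.zeros(d) for _ in range(k)]
    for _ in range(rounds):
        for e, (X, y) in enumerate(envs):
            # Assumed ensemble rule: average of all players' weights.
            w_avg = sum(weights) / k
            residual = X @ w_avg - y
            # Gradient of environment e's risk with respect to its own weights
            # only; the other players' weights are held fixed.
            grad = (2.0 / len(y)) * (X.T @ residual) / k
            # One gradient step stands in for an exact best response.
            weights[e] = weights[e] - lr * grad
    return weights


# Toy data (assumed): both environments share the same linear mechanism
# but differ in the scale of their inputs.
rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
envs = []
for scale in (1.0, 2.0):
    X = scale * rng.normal(size=(200, 2))
    envs.append((X, X @ true_w))

weights = best_response_dynamics(envs)
w_avg = sum(weights) / len(weights)
```

In this toy setting every environment is fit by the same weights, so the averaged predictor `w_avg` can be compared directly against `true_w`. When environments disagree, the individual players need not settle, and the sketch makes no claim about the guarantees stated above.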