Proximal Regret and Proximal Correlated Equilibria: A New Tractable Solution Concept for Online Learning and Games

Abstract

Learning and computation of equilibria are central problems in game theory,theory of computation, and artificial intelligence. In this work, we introduceproximal regret, a new notion of regret based on proximal operators that liesstrictly between external and swap regret. When every player employs ano-proximal-regret algorithm in a general convex game, the empiricaldistribution of play converges to proximal correlated equilibria (PCE), arefinement of coarse correlated equilibria. Our framework unifies severalemerging notions in online learning and game theory-such as gradientequilibrium and semicoarse correlated equilibrium-and introduces new ones. Ourmain result shows that the classic Online Gradient Descent (GD) algorithmachieves an optimal $O(\sqrt{T})$ bound on proximal regret, revealing that GD,without modification, minimizes a stronger regret notion than external regret.This provides a new explanation for the empirically superior performance ofgradient descent in online learning and games. We further extend our analysisto Mirror Descent in the Bregman setting and to Optimistic Gradient Descent,which yields faster convergence in smooth convex games.

Quick Read (beta)

loading the full paper ...