Solving nonconvex Hamilton--Jacobi--Isaacs equations with PINN-based policy iteration

Abstract

We propose a mesh-free policy iteration framework that combines classical dynamic programming with physics-informed neural networks (PINNs) to solve high-dimensional, nonconvex Hamilton--Jacobi--Isaacs (HJI) equations arising in stochastic differential games and robust control. The method alternates between solving linear second-order PDEs under fixed feedback policies and updating the controls via pointwise minimax optimization using automatic differentiation. Under standard Lipschitz and uniform ellipticity assumptions, we prove that the value function iterates converge locally uniformly to the unique viscosity solution of the HJI equation. The analysis establishes equi-Lipschitz regularity of the iterates, enabling provable stability and convergence without requiring convexity of the Hamiltonian. Numerical experiments demonstrate the accuracy and scalability of the method. In a two-dimensional stochastic path-planning game with a moving obstacle, our method matches finite-difference benchmarks with relative $L^2$-errors below $10^{-2}$. In five- and ten-dimensional publisher-subscriber differential games with anisotropic noise, the proposed approach consistently outperforms direct PINN solvers, yielding smoother value functions and lower residuals. Our results suggest that integrating PINNs with policy iteration is a practical and theoretically grounded method for solving high-dimensional, nonconvex HJI equations, with potential applications in robotics, finance, and multi-agent reinforcement learning.
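
The policy-update half of the alternation described above can be illustrated with a minimal sketch. It is not the paper's implementation: the dynamics, running cost, and the names `hamiltonian` and `minimax_policy` are hypothetical, the min-over-`u`, max-over-`d` ordering is an assumption, and a grid search over candidate controls stands in for the automatic-differentiation-based optimization. In the actual method the gradient `p` would presumably come from differentiating the PINN value network at the point `x`.

```python
def hamiltonian(x, p, u, d):
    """Toy scalar Hamiltonian p * f(x, u, d) + running cost (both hypothetical)."""
    drift = u + d * x                          # assumed dynamics f(x, u, d)
    cost = x * x + 0.5 * u * u - 0.5 * d * d   # assumed running cost
    return p * drift + cost


def minimax_policy(x, p, u_grid, d_grid):
    """Pointwise policy update at one state x, given the value gradient p.

    Picks the control u that minimizes the worst case over the disturbance d,
    searching finite candidate grids (a stand-in for a gradient-based solve).
    """
    best_u, best_d, best_val = None, None, float("inf")
    for u in u_grid:
        # Inner maximization: the adversary's best response to this u.
        d_star = max(d_grid, key=lambda d: hamiltonian(x, p, u, d))
        val = hamiltonian(x, p, u, d_star)
        # Outer minimization: keep the u with the smallest worst-case value.
        if val < best_val:
            best_u, best_d, best_val = u, d_star, val
    return best_u, best_d, best_val


grid = [-1.0, -0.5, 0.0, 0.5, 1.0]
u_star, d_star, h_star = minimax_policy(x=1.0, p=2.0, u_grid=grid, d_grid=grid)
print(u_star, d_star, h_star)
```

Once the controls are frozen at such pointwise optimizers, the remaining equation for the value function is linear, which is the step the abstract assigns to the PINN solver.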
