Hierarchical Reinforcement Learning for Optimal Control of Linear Multi-Agent Systems: the Homogeneous Case

Abstract

Individual agents in a multi-agent system (MAS) may have decoupled open-loop dynamics, but a cooperative control objective usually results in coupled closed-loop dynamics, thereby making the control design computationally expensive. The computation time becomes even higher when a learning strategy such as reinforcement learning (RL) must be applied because the agents' dynamics are not known. To resolve this problem, this paper proposes a hierarchical RL scheme for a linear quadratic regulator (LQR) design in a continuous-time linear MAS. The idea is to exploit the structural properties of two graphs embedded in the $Q$ and $R$ weighting matrices in the LQR objective to define an orthogonal transformation that converts the original LQR design into multiple decoupled, smaller-sized LQR designs. We show that if the MAS is homogeneous, then this decomposition retains closed-loop optimality. Conditions for decomposability, an algorithm for constructing the transformation matrix, a hierarchical RL algorithm, and a robustness analysis for the case when the design is applied to a non-homogeneous MAS are presented. Simulations show that the proposed approach yields a significant speed-up in learning without any loss in the cumulative value of the LQR cost.
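
The decoupling idea can be illustrated with a minimal model-based sketch. It is a simplified special case and not the paper's algorithm: the agent matrices `A0` and `B0`, the path-graph Laplacian `L`, and the weights `Q = (I + L) ⊗ Q0`, `R = I ⊗ R0` are all assumed values chosen for illustration, only $Q$ carries graph structure here, and the Riccati equations are solved with known dynamics rather than learned by RL. Under those assumptions, the orthogonal eigenvector matrix of `L` splits one 8-state LQR problem into four 2-state problems whose solutions reassemble into the full one.

```python
import numpy as np
from scipy.linalg import block_diag, solve_continuous_are

# Assumed illustrative values (not taken from the paper).
N = 4                                        # number of identical agents
A0 = np.array([[0.0, 1.0], [-1.0, -0.5]])    # per-agent state matrix
B0 = np.array([[0.0], [1.0]])                # per-agent input matrix
Q0 = np.eye(2)                               # per-agent state weight
R0 = np.array([[1.0]])                       # per-agent input weight

# Laplacian of an assumed path graph 1-2-3-4 that couples the agents in Q.
L = np.array([[ 1.0, -1.0,  0.0,  0.0],
              [-1.0,  2.0, -1.0,  0.0],
              [ 0.0, -1.0,  2.0, -1.0],
              [ 0.0,  0.0, -1.0,  1.0]])

# Homogeneous MAS: decoupled open-loop dynamics, coupled cost.
A = np.kron(np.eye(N), A0)
B = np.kron(np.eye(N), B0)
Q = np.kron(np.eye(N) + L, Q0)
R = np.kron(np.eye(N), R0)

# Full-size design: one 8x8 algebraic Riccati equation.
P_full = solve_continuous_are(A, B, Q, R)

# Orthogonal transformation built from the eigenvectors of L.
lam, U = np.linalg.eigh(L)
T = np.kron(U, np.eye(A0.shape[0]))

# Decoupled design: N independent 2x2 Riccati equations,
# one per eigenvalue, with state weight (1 + lambda_i) * Q0.
P_blocks = [solve_continuous_are(A0, B0, (1.0 + l) * Q0, R0) for l in lam]

# Map the block-diagonal solution back to the original coordinates.
P_rec = T @ block_diag(*P_blocks) @ T.T

max_gap = np.max(np.abs(P_full - P_rec))
print("max |P_full - P_rec| =", max_gap)
```

In this special case the reassembled matrix matches the full Riccati solution up to numerical error, because `T` commutes with the block-diagonal dynamics and diagonalizes the graph part of `Q`. The optimal gain then follows as `np.linalg.solve(R, B.T @ P_rec)`.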
