Optimal PID and Antiwindup Control Design as a Reinforcement Learning Problem

Abstract

Deep reinforcement learning (DRL) has seen several successful applications to process control. Common methods rely on a deep neural network structure to model the controller or process. With increasingly complicated control structures, the closed-loop stability of such methods becomes less clear. In this work, we focus on the interpretability of DRL control methods. In particular, we view linear fixed-structure controllers as shallow neural networks embedded in the actor-critic framework. PID controllers guide our development due to their simplicity and acceptance in industrial practice. We then consider input saturation, leading to a simple nonlinear control structure. In order to effectively operate within the actuator limits, we then incorporate a tuning parameter for anti-windup compensation. Finally, the simplicity of the controller allows for straightforward initialization. This makes our method inherently stabilizing, both during and after training, and amenable to known operational PID gains.
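
To make the controller structure concrete, the following is a minimal Python sketch of a discrete-time PID law with input saturation and one anti-windup tuning parameter. It is an illustration under stated assumptions, not the paper's implementation: the back-calculation form of anti-windup, the function name `pid_antiwindup_step`, the parameter name `rho`, and the forward-Euler discretization are all hypothetical choices, since the abstract does not specify them. The point is that the unsaturated control signal is a weighted sum of three features (error, integrated error, error difference), which can be read as a single-layer network whose weights are the PID gains, with saturation as the only nonlinearity.

```python
def pid_antiwindup_step(params, state, error, dt, u_min, u_max):
    """One step of a discrete PID law with saturation and anti-windup.

    params: (kp, ki, kd, rho); rho is a hypothetical anti-windup gain
            (back-calculation form, assumed here for illustration).
    state:  (integral, prev_error) carried between steps.
    Returns the saturated input and the updated state.
    """
    kp, ki, kd, rho = params
    integral, prev_error = state

    # Finite-difference approximation of the error derivative.
    derivative = (error - prev_error) / dt

    # Linear part: a weighted sum of three features, i.e. a shallow
    # "network" whose weights are the PID gains.
    u_raw = kp * error + ki * integral + kd * derivative

    # Actuator limits: the only nonlinearity in the controller.
    u_sat = min(max(u_raw, u_min), u_max)

    # Back-calculation: when the actuator saturates, (u_sat - u_raw) is
    # nonzero and pulls the integrator back; with rho = 0 this reduces
    # to a plain forward-Euler integrator.
    integral = integral + dt * (error + rho * (u_sat - u_raw))

    return u_sat, (integral, error)


# Usage: one step with a large error and tight actuator limits,
# with and without anti-windup compensation.
u_aw, (int_aw, _) = pid_antiwindup_step((2.0, 1.0, 0.0, 1.0), (0.0, 0.0),
                                        error=5.0, dt=0.1, u_min=-1.0, u_max=1.0)
u_no, (int_no, _) = pid_antiwindup_step((2.0, 1.0, 0.0, 0.0), (0.0, 0.0),
                                        error=5.0, dt=0.1, u_min=-1.0, u_max=1.0)
```

In this sketch the four entries of `params` would play the role of the trainable weights, and initializing them from known operational PID gains corresponds to the straightforward initialization mentioned above. Both calls return the same saturated input, but the integrator state grows less when `rho` is positive.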
