Steady-State Error Compensation for Reinforcement Learning with Quadratic Rewards

Abstract

The choice of reward function in Reinforcement Learning (RL) has garnered significant attention because of its impact on system performance. Significant steady-state errors often appear when quadratic reward functions are employed. Absolute-value-type reward functions alleviate this problem, but they tend to induce substantial fluctuations in specific system states, leading to abrupt changes. In response to this challenge, this study proposes adding an integral term to quadratic-type reward functions. The integral term makes the RL algorithm account for the history of rewards and consequently alleviates steady-state errors. Through experiments and performance evaluations on Adaptive Cruise Control (ACC) and lane change models, we validate that the proposed method effectively reduces steady-state errors without causing significant spikes in certain system states.
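
To make the idea concrete, the following is a minimal Python sketch of a quadratic reward augmented with an integral term. The abstract does not give the exact reward formula, so the form used here (a quadratic penalty on the accumulated tracking error), the names `quadratic_reward` and `IntegralQuadraticReward`, and the parameters `q`, `k_i`, and `dt` are all illustrative assumptions rather than the authors' definition.

```python
from dataclasses import dataclass


def quadratic_reward(error: float, q: float = 1.0) -> float:
    """Plain quadratic reward: penalizes only the current squared error."""
    return -q * error ** 2


@dataclass
class IntegralQuadraticReward:
    """Quadratic reward plus a quadratic penalty on the accumulated error.

    Assumed form (not taken from the paper):
        r_t = -(q * e_t**2 + k_i * I_t**2),  I_t = I_{t-1} + e_t * dt
    The weights q, k_i and the step size dt are hypothetical values.
    """
    q: float = 1.0
    k_i: float = 0.1
    dt: float = 0.1
    integral: float = 0.0

    def reset(self) -> None:
        # Clear the accumulated error at the start of each episode.
        self.integral = 0.0

    def __call__(self, error: float) -> float:
        # Accumulate the error, then penalize both the error and its integral.
        self.integral += error * self.dt
        return -(self.q * error ** 2 + self.k_i * self.integral ** 2)


if __name__ == "__main__":
    reward = IntegralQuadraticReward()
    offset = 0.05  # a small, persistent tracking error
    for step in range(1, 201):
        r_int = reward(offset)
        if step in (1, 100, 200):
            print(step, quadratic_reward(offset), r_int)
```

Under a small constant offset, the plain quadratic penalty stays at the same tiny value on every step, while the integral-augmented penalty keeps growing, so a persistent offset becomes increasingly costly. In a real training setup, the accumulated error would likely also need to be exposed to the agent as part of its observation; that detail is an assumption here, not something stated in the abstract.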
