Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management

Abstract

Many optimal control problems require the simultaneous output of discrete and continuous control variables. These problems are usually formulated as mixed-integer optimal control (MIOC) problems, which are challenging to solve due to the complexity of the solution space. Numerical methods such as branch-and-bound are computationally expensive and therefore undesirable for real-time control. This paper proposes a novel hybrid-action reinforcement learning (HARL) algorithm, twin delayed deep deterministic actor-Q (TD3AQ), for MIOC problems. TD3AQ combines the advantages of actor-critic and Q-learning methods and can handle discrete and continuous action spaces simultaneously. The proposed algorithm is evaluated on a plug-in hybrid electric vehicle (PHEV) energy management problem, where real-time control of the discrete variables (clutch engagement/disengagement and gear shift) and the continuous variable (engine torque) is essential to maximize fuel economy while satisfying driving constraints. Simulation results demonstrate that TD3AQ achieves control results close to optimality when compared with dynamic programming (DP), with a difference of just 4.69%. Furthermore, it outperforms baseline reinforcement learning algorithms.
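To make the idea of a hybrid action concrete, the following is a minimal sketch of one common way to select a discrete and a continuous action together: an actor proposes the continuous value, and a Q-function scores each discrete option given the state and that value. It is an illustration under stated assumptions, not the TD3AQ architecture itself, whose details are not given in this abstract. The state dimension, the number of discrete options, the torque limit, and the random linear maps standing in for trained networks are all hypothetical.

```python
import math
import random

random.seed(0)

# All sizes and limits below are hypothetical, chosen only for illustration.
STATE_DIM = 4        # e.g. speed, acceleration, battery state of charge, gear
N_DISCRETE = 6       # e.g. number of clutch/gear combinations
TORQUE_MAX = 200.0   # assumed engine torque limit in N*m

# Random linear maps stand in for trained actor and Q networks.
w_actor = [random.gauss(0, 1) for _ in range(STATE_DIM)]
w_q = [[random.gauss(0, 1) for _ in range(STATE_DIM + 1)]
       for _ in range(N_DISCRETE)]


def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def actor(state):
    """Continuous part: map the state to a torque in [0, TORQUE_MAX]."""
    return TORQUE_MAX * 0.5 * (math.tanh(dot(w_actor, state)) + 1.0)


def q_values(state, torque):
    """Discrete part: one Q-value per discrete option, given state and torque."""
    features = list(state) + [torque / TORQUE_MAX]
    return [dot(row, features) for row in w_q]


def select_action(state):
    """Return (discrete index, continuous torque) as one hybrid action."""
    torque = actor(state)
    q = q_values(state, torque)
    discrete = max(range(N_DISCRETE), key=lambda i: q[i])  # greedy choice
    return discrete, torque


state = [random.gauss(0, 1) for _ in range(STATE_DIM)]
discrete_action, engine_torque = select_action(state)
```

Here `select_action` returns both parts of the control in a single forward pass, which is what makes this style of policy attractive for real-time use compared with solving a mixed-integer program at every step.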
