Abstract
Unit commitment (UC) is a fundamental problem in the day-ahead electricity market, and it is critical to solve UC problems efficiently. Mathematical optimization techniques such as dynamic programming, Lagrangian relaxation, and mixed-integer quadratic programming (MIQP) are commonly adopted for UC problems. However, the computation time of these methods grows exponentially with the number of generators and energy resources, which remains the main bottleneck in industry. Recent advances in artificial intelligence have demonstrated the capability of reinforcement learning (RL) to solve UC problems. Unfortunately, existing RL approaches to UC suffer from the curse of dimensionality as the problem size grows. To address these issues, we propose an optimization method-assisted ensemble deep reinforcement learning algorithm, in which UC problems are formulated as a Markov Decision Process (MDP) and solved by multi-step deep Q-learning in an ensemble framework. The proposed algorithm establishes a candidate action set by solving tailored optimization problems, which ensures relatively high performance and the satisfaction of operational constraints. Numerical studies on the IEEE 118-bus and 300-bus systems show that our algorithm outperforms the baseline RL algorithm and MIQP. Furthermore, the proposed algorithm shows strong generalization capacity under unforeseen operational conditions.
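
As a rough illustration of the multi-step Q-learning component named in the abstract, the sketch below computes a generic n-step bootstrapped target. The function name `n_step_target`, the default discount factor, and the reading of rewards as negative operating costs are assumptions made for illustration only; the abstract does not specify them. The candidate-action optimization and the ensemble are not modeled here.

```python
from typing import Sequence


def n_step_target(rewards: Sequence[float],
                  bootstrap_q: float,
                  gamma: float = 0.99) -> float:
    """Generic multi-step TD target.

    Computes sum_{k=0}^{n-1} gamma^k * r_k + gamma^n * bootstrap_q,
    where bootstrap_q stands for max_a Q(s_n, a) evaluated at the state
    reached after n steps (hypothetical input, supplied by the caller).
    """
    target = bootstrap_q
    # Fold rewards from last to first so each step applies one discount.
    for r in reversed(rewards):
        target = r + gamma * target
    return target


# Illustrative values only: rewards taken as negative hourly operating
# costs (an assumption), followed by a bootstrapped value estimate.
example = n_step_target([-3.0, -2.0, -4.0], bootstrap_q=-10.0, gamma=0.9)
```

With a single reward this reduces to the usual one-step Q-learning target; longer reward windows propagate cost information across more hours of the scheduling horizon per update.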