Abstract
Recent advancements in machine learning-based energy management approaches, specifically reinforcement learning with a safety layer (OptLayerPolicy) and a metaheuristic algorithm generating a decision tree control policy (TreeC), have shown promise. However, their effectiveness has only been demonstrated in computer simulations. This paper presents the real-world validation of these methods, comparing them against model predictive control and simple rule-based control benchmarks. The experiments were conducted on the electrical installation of four reproductions of residential houses, each with its own battery, photovoltaic system, and dynamic load system that emulates a non-controllable electrical load and a controllable electric vehicle charger. The results show that the simple rules, TreeC, and model predictive control-based methods achieved similar costs, with a difference of only 0.6%. The reinforcement learning-based method, still in its training phase, obtained a cost 25.5% higher than the other methods. Additional simulations show that the costs can be further reduced by using a more representative training dataset for TreeC and by addressing errors in the model predictive control implementation caused by its reliance on accurate data from various sources. The OptLayerPolicy safety layer allows safe online training of a reinforcement learning agent in the real world, given an accurate constraint function formulation. The proposed safety layer method remains error-prone; nonetheless, it is found to be beneficial for all investigated methods. The TreeC method, which does require building a realistic simulation for training, exhibits the safest operational performance, exceeding the grid limit by only 27.1 Wh compared to 593.9 Wh for reinforcement learning.