A Prescriptive Dirichlet Power Allocation Policy with Deep Reinforcement Learning

  • 2022-01-20 20:41:04
  • Yuan Tian, Minghao Han, Chetan Kulkarni, Olga Fink

Abstract

Prescribing optimal operation based on the condition of the system and, thereby, potentially prolonging its remaining useful lifetime has great potential for actively managing the availability, maintenance, and costs of complex systems. Reinforcement learning (RL) algorithms are particularly suitable for this type of problem given their learning capabilities. A special case of prescriptive operation is the power allocation task, which can be considered a sequential allocation problem in which the action space is bounded by a simplex constraint. A general continuous action-space solution for such sequential allocation problems remains an open research question for RL algorithms. In continuous action spaces, the standard Gaussian policy applied in reinforcement learning does not support simplex constraints, while the Gaussian-softmax policy introduces a bias during training. In this work, we propose the Dirichlet policy for continuous allocation tasks and analyze the bias and variance of its policy gradients. We demonstrate that the Dirichlet policy is bias-free and provides significantly faster convergence, better performance, and better hyperparameter robustness than the Gaussian-softmax policy. Moreover, we demonstrate the applicability of the proposed algorithm on a prescriptive operation case, where we propose the Dirichlet power allocation policy and evaluate its performance on a case study of a set of multiple lithium-ion (Li-I) battery systems. The experimental results show the potential to prescribe optimal operation and to improve the efficiency and sustainability of multi-power source systems.
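
To make the simplex constraint concrete, the following is a minimal Python sketch of the two ways of sampling an allocation that the abstract contrasts: drawing directly from a Dirichlet distribution, and drawing a Gaussian vector that is then passed through a softmax. It is an illustration only and not the authors' implementation. The concentration parameters, Gaussian means, standard deviation, and total power demand are hypothetical values, and the Dirichlet draw uses the standard construction of normalizing independent Gamma variates.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw one point on the simplex by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_gaussian_softmax(mean, std, rng):
    """Draw an unconstrained Gaussian vector, then map it onto the simplex with softmax."""
    z = [rng.gauss(m, std) for m in mean]
    z_max = max(z)  # subtract the maximum for numerical stability
    exps = [math.exp(v - z_max) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


rng = random.Random(0)

# Hypothetical policy outputs for three power sources (assumed values).
alpha = [2.0, 5.0, 3.0]
dirichlet_action = sample_dirichlet(alpha, rng)
softmax_action = sample_gaussian_softmax([0.0, 1.0, 0.5], 0.3, rng)

# Hypothetical total power demand, split according to the Dirichlet sample.
total_power = 100.0
allocation = [total_power * w for w in dirichlet_action]

print("Dirichlet weights:", dirichlet_action)
print("Gaussian-softmax weights:", softmax_action)
print("Power allocation:", allocation)
```

Both samplers return non-negative weights that sum to one, so either can split a fixed demand across sources. The difference the abstract highlights concerns the policy gradients obtained during training, which this sketch does not attempt to reproduce.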

 
