Explainable Post hoc Portfolio Management Financial Policy of a Deep Reinforcement Learning agent

  • 2024-07-19 18:40:39
  • Alejandra de la Rica Escudero, Eduardo C. Garrido-Merchan, Maria Coronado-Vaca

Abstract

Financial portfolio management investment policies computed quantitatively by modern portfolio theory techniques like the Markowitz model rely on a set of assumptions that are not supported by data in high volatility markets. Hence, quantitative researchers are looking for alternative models to tackle this problem. Concretely, portfolio management is a problem that has been successfully addressed recently by Deep Reinforcement Learning (DRL) approaches. In particular, DRL algorithms train an agent by estimating the distribution of the expected reward of every action performed by the agent given any financial state in a simulator. However, these methods rely on Deep Neural Network models to represent such a distribution; although these models are universal approximators, they cannot explain their behaviour, which is given by a set of parameters that are not interpretable. Critically, financial investors' policies require predictions to be interpretable, so DRL agents are not suited to follow a particular policy or explain their actions. In this work, we developed a novel Explainable Deep Reinforcement Learning (XDRL) approach for portfolio management, integrating Proximal Policy Optimization (PPO) with the model-agnostic explainability techniques of feature importance, SHAP and LIME to enhance transparency at prediction time. By executing our methodology, we can interpret at prediction time the actions of the agent to assess whether they follow the requisites of an investment policy or to assess the risk of following the agent's suggestions. To the best of our knowledge, our proposed approach is the first explainable post hoc portfolio management financial policy of a DRL agent. We empirically illustrate our methodology by successfully identifying key features influencing investment decisions, which demonstrates the ability to explain the agent's actions at prediction time.
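
As a rough illustration of what a post hoc, model-agnostic explanation of an agent's actions can look like, the sketch below computes a permutation-style feature importance for a policy treated as a black box. It is not the authors' implementation: `toy_policy` is a hypothetical stand-in for a trained PPO agent, the three state features and two assets are invented for the example, and permutation importance is used here only as one simple form of feature importance (SHAP and LIME are not shown).

```python
import numpy as np


def toy_policy(states: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a trained agent: states -> portfolio weights."""
    # Assumed fixed parameters: 3 state features, 2 assets.
    # The third feature has zero weight, so the policy ignores it by design.
    W = np.array([[2.0, -2.0],
                  [0.5, -0.5],
                  [0.0, 0.0]])
    scores = states @ W
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=1, keepdims=True)  # softmax: weights sum to 1


def permutation_importance(policy, states, rng):
    """Mean absolute change in the policy output when one feature is shuffled."""
    base = policy(states)
    importances = np.zeros(states.shape[1])
    for j in range(states.shape[1]):
        perturbed = states.copy()
        # Shuffling column j breaks its link to the rest of each state.
        perturbed[:, j] = rng.permutation(perturbed[:, j])
        importances[j] = np.abs(policy(perturbed) - base).mean()
    return importances


rng = np.random.default_rng(0)
states = rng.normal(size=(500, 3))  # synthetic states, not market data
importances = permutation_importance(toy_policy, states, rng)
print(importances)
```

Because the explanation only queries the policy's inputs and outputs, the same loop could in principle be pointed at any trained agent; the feature the toy policy ignores receives zero importance, which is the kind of check an investor could use to see whether an agent's actions depend on the features an investment policy expects.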

 
