Abstract
An open research question in deep reinforcement learning is how to focus policy learning on key decisions within a sparse domain. This paper emphasizes combining the advantages of input-output hidden Markov models and reinforcement learning towards interpretable maintenance decisions. We propose a novel hierarchical-modeling methodology that, at a high level, detects and interprets the root cause of a failure as well as the health degradation of the turbofan engine, while, at a low level, it provides the optimal replacement policy. It outperforms baseline deep reinforcement learning methods, whether applied directly to the raw data or combined with a hidden Markov model that lacks such a specialized hierarchy. Its performance is also comparable to prior work, with the additional benefit of interpretability.