Deep Multi-Objective Reinforcement Learning for Utility-Based Infrastructural Maintenance Optimization

Abstract

In this paper, we introduce Multi-Objective Deep Centralized Multi-Agent Actor-Critic (MO-DCMAC), a multi-objective reinforcement learning (MORL) method for infrastructural maintenance optimization, an area traditionally dominated by single-objective reinforcement learning (RL) approaches. Previous single-objective RL methods combine multiple objectives, such as probability of collapse and cost, into a single reward signal through reward shaping. In contrast, MO-DCMAC can optimize a policy for multiple objectives directly, even when the utility function is non-linear. We evaluated MO-DCMAC using two utility functions, both of which take probability of collapse and cost as input. The first is the Threshold utility, under which MO-DCMAC should minimize cost while keeping the probability of collapse from ever exceeding the threshold. The second is based on the Failure Mode, Effects, and Criticality Analysis (FMECA) methodology used by asset managers to assess maintenance plans. We evaluated MO-DCMAC with both utility functions in multiple maintenance environments, including ones based on a case study of the historical quay walls of Amsterdam. The performance of MO-DCMAC was compared against multiple rule-based policies based on heuristics currently used for constructing maintenance plans. Our results demonstrate that MO-DCMAC outperforms traditional rule-based policies across various environments and utility functions.
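
To make the idea of a non-linear utility concrete, the following is a minimal sketch of a threshold-style utility over the two objectives named above. It is an illustration only, not the paper's definition: the function name, the default threshold of 0.05, and the fixed penalty of 1e6 are all hypothetical choices made for this example.

```python
def threshold_utility(
    cost: float,
    p_collapse: float,
    threshold: float = 0.05,   # hypothetical limit on collapse probability
    penalty: float = 1e6,      # hypothetical penalty for exceeding the limit
) -> float:
    """Illustrative threshold-style utility (higher is better).

    Below or at the threshold, utility is just the negative cost.
    Above the threshold, a large fixed penalty is added, so the
    utility is a non-linear (step-like) function of p_collapse.
    """
    if p_collapse > threshold:
        return -(cost + penalty)
    return -cost


# A cheap plan that violates the threshold versus a costlier plan that respects it.
cheap_but_risky = threshold_utility(cost=10.0, p_collapse=0.2)
costly_but_safe = threshold_utility(cost=50.0, p_collapse=0.01)
print(costly_but_safe > cheap_but_risky)
```

Because the utility jumps at the threshold rather than trading the two objectives off at a fixed rate, no single fixed weighting of cost and collapse probability reproduces it, which is the kind of case the abstract says a multi-objective method can handle directly.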
