Offline Inverse Constrained Reinforcement Learning for Safe-Critical Decision Making in Healthcare

Abstract

Reinforcement Learning (RL) applied in healthcare can lead to unsafe medical decisions and treatment, such as excessive dosages or abrupt changes, often due to agents overlooking common-sense constraints. Consequently, Constrained Reinforcement Learning (CRL) is a natural choice for safe decisions. However, specifying the exact cost function is inherently difficult in healthcare. Recent Inverse Constrained Reinforcement Learning (ICRL) is a promising approach that infers constraints from expert demonstrations. ICRL algorithms model Markovian decisions in an interactive environment. These settings do not align with the practical requirement of a decision-making system in healthcare, where decisions rely on historical treatment recorded in an offline dataset. To tackle these issues, we propose the Constraint Transformer (CT). Specifically, 1) we utilize a causal attention mechanism to incorporate historical decisions and observations into the constraint modeling, while employing a Non-Markovian layer for weighted constraints to capture critical states. 2) A generative world model is used to perform exploratory data augmentation, enabling offline RL methods to simulate unsafe decision sequences. In multiple medical scenarios, empirical results demonstrate that CT can capture unsafe states and achieve strategies that approximate lower mortality rates, reducing the occurrence probability of unsafe behaviors.
