### Abstract

Constrained reinforcement learning is to maximize the expected reward subjectto constraints on utilities/costs. However, the training environment may not bethe same as the test one, due to, e.g., modeling error, adversarial attack,non-stationarity, resulting in severe performance degradation and moreimportantly constraint violation. We propose a framework of robust constrainedreinforcement learning under model uncertainty, where the MDP is not fixed butlies in some uncertainty set, the goal is to guarantee that constraints onutilities/costs are satisfied for all MDPs in the uncertainty set, and tomaximize the worst-case reward performance over the uncertainty set. We designa robust primal-dual approach, and further theoretically develop guarantee onits convergence, complexity and robust feasibility. We then investigate aconcrete example of $\delta$-contamination uncertainty set, design an onlineand model-free algorithm and theoretically characterize its sample complexity.