Pareto Set Learning for Multi-Objective Reinforcement Learning

Abstract

Multi-objective decision-making problems have emerged in numerous real-world scenarios, such as video games, navigation and robotics. Considering the clear advantages of Reinforcement Learning (RL) in optimizing decision-making processes, researchers have developed Multi-Objective RL (MORL) methods for solving multi-objective decision problems. However, previous methods either cannot obtain the entire Pareto front, or employ only a single policy network for all the preferences over multiple objectives, which may not produce personalized solutions for each preference. To address these limitations, we propose a novel decomposition-based framework for MORL, Pareto Set Learning for MORL (PSL-MORL), that harnesses the generation capability of a hypernetwork to produce the parameters of the policy network for each decomposition weight, efficiently generating relatively distinct policies for the various scalarized subproblems. PSL-MORL is a general framework that is compatible with any RL algorithm. Our theoretical results guarantee the superiority of the model capacity of PSL-MORL and the optimality of the obtained policy network. Through extensive experiments on diverse benchmarks, we demonstrate the effectiveness of PSL-MORL in achieving dense coverage of the Pareto front, significantly outperforming state-of-the-art MORL methods in the hypervolume and sparsity indicators.
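
The core idea described above, a hypernetwork that maps a decomposition weight to the parameters of a policy network, can be illustrated with a minimal sketch. This is not the paper's implementation. The layer sizes, the single linear softmax policy, the function names, and the linear weighted-sum scalarization are all assumptions made for illustration, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes, chosen only for illustration.
N_OBJ, OBS_DIM, ACT_DIM, HID = 2, 4, 3, 16

# The target policy here is one linear layer followed by a softmax (an assumption).
N_POLICY_PARAMS = OBS_DIM * ACT_DIM + ACT_DIM

# Hypernetwork: a two-layer MLP from a weight vector to the flat policy parameters.
W1 = rng.normal(0.0, 0.5, size=(N_OBJ, HID))
b1 = np.zeros(HID)
W2 = rng.normal(0.0, 0.5, size=(HID, N_POLICY_PARAMS))
b2 = np.zeros(N_POLICY_PARAMS)


def generate_policy_params(weight):
    """Map a decomposition weight to the policy's weight matrix and bias."""
    hidden = np.tanh(weight @ W1 + b1)
    flat = hidden @ W2 + b2
    policy_w = flat[: OBS_DIM * ACT_DIM].reshape(OBS_DIM, ACT_DIM)
    policy_b = flat[OBS_DIM * ACT_DIM:]
    return policy_w, policy_b


def policy_action_probs(obs, weight):
    """Action distribution of the policy generated for this weight."""
    policy_w, policy_b = generate_policy_params(weight)
    logits = obs @ policy_w + policy_b
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()


def scalarize(reward_vec, weight):
    """Weighted-sum scalarization of a vector reward (assumed form)."""
    return float(np.dot(weight, reward_vec))


obs = np.ones(OBS_DIM)
probs_a = policy_action_probs(obs, np.array([0.9, 0.1]))
probs_b = policy_action_probs(obs, np.array([0.1, 0.9]))
```

In this sketch, each weight vector yields its own policy parameters, so `probs_a` and `probs_b` generally differ even for the same observation. In a full method, the hypernetwork parameters (`W1`, `b1`, `W2`, `b2` here) would be the quantities trained by the chosen RL algorithm on the scalarized reward.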
