Divide and Conquer: Provably Unveiling the Pareto Front with Multi-Objective Reinforcement Learning

  • 2025-02-03 11:02:03
  • Willem Röpke, Mathieu Reymond, Patrick Mannion, Diederik M. Roijers, Ann Nowé, Roxana Rădulescu


An important challenge in multi-objective reinforcement learning is obtaining a Pareto front of policies to attain optimal performance under different preferences. We introduce Iterated Pareto Referent Optimisation (IPRO), which decomposes finding the Pareto front into a sequence of constrained single-objective problems. This enables us to guarantee convergence while providing an upper bound on the distance to undiscovered Pareto optimal solutions at each step. We evaluate IPRO using utility-based metrics and its hypervolume and find that it matches or outperforms methods that require additional assumptions. By leveraging problem-specific single-objective solvers, our approach also holds promise for applications beyond multi-objective reinforcement learning, such as planning and pathfinding.
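To make the decomposition idea concrete, the sketch below recovers a two-objective Pareto front by repeatedly solving a constrained single-objective problem. It is not IPRO itself: it uses the classic epsilon-constraint scheme on a finite list of candidate value vectors, which stands in for a set of policies. The names `constrained_solver`, `pareto_front_by_decomposition`, and `hypervolume_2d` are hypothetical and chosen for illustration only. Both objectives are assumed to be maximised.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]  # (objective 0, objective 1), both maximised


def constrained_solver(candidates: List[Point], min_second: float) -> Optional[Point]:
    """Toy single-objective oracle (an assumption, not the paper's solver):
    maximise objective 0 subject to objective 1 > min_second."""
    feasible = [p for p in candidates if p[1] > min_second]
    if not feasible:
        return None
    # Ties on objective 0 are broken by the larger objective 1,
    # so the returned point is never dominated by another candidate.
    return max(feasible, key=lambda p: (p[0], p[1]))


def pareto_front_by_decomposition(candidates: List[Point]) -> List[Point]:
    """Epsilon-constraint loop: each solve tightens the constraint on objective 1."""
    front: List[Point] = []
    threshold = float("-inf")
    while True:
        best = constrained_solver(candidates, threshold)
        if best is None:
            return front
        front.append(best)
        threshold = best[1]  # the next solution must strictly improve objective 1


def hypervolume_2d(front: List[Point], reference: Point) -> float:
    """Area dominated by a Pareto front, measured from a reference point
    that every front point is assumed to dominate."""
    area = 0.0
    prev_y = reference[1]
    # Sorting by objective 0 descending puts objective 1 in ascending order.
    for x, y in sorted(front, key=lambda p: p[0], reverse=True):
        area += (x - reference[0]) * (y - prev_y)
        prev_y = y
    return area


candidates = [(1, 5), (2, 4), (3, 3), (2, 2), (4, 1), (1, 1)]
front = pareto_front_by_decomposition(candidates)
print(front)
print(hypervolume_2d(front, (0, 0)))
```

Each call to the oracle is an ordinary single-objective problem with one constraint, which is where a problem-specific solver could be substituted. The convergence guarantee and the distance bound described in the abstract are properties of IPRO and are not reproduced by this toy loop.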

