Utility-Based Reinforcement Learning: Unifying Single-objective and Multi-objective Reinforcement Learning

  • 2024-02-05 01:42:28
  • Peter Vamplew, Cameron Foale, Conor F. Hayes, Patrick Mannion, Enda Howley, Richard Dazeley, Scott Johnson, Johan Källström, Gabriel Ramos, Roxana Rădulescu, Willem Röpke, Diederik M. Roijers

Abstract

Research in multi-objective reinforcement learning (MORL) has introduced the utility-based paradigm, which makes use of both environmental rewards and a function that defines the utility derived by the user from those rewards. In this paper we extend this paradigm to the context of single-objective reinforcement learning (RL), and outline multiple potential benefits including the ability to perform multi-policy learning across tasks relating to uncertain objectives, risk-aware RL, discounting, and safe RL. We also examine the algorithmic implications of adopting a utility-based approach.

 
