Stage-Wise Reward Shaping for Acrobatic Robots: A Constrained Multi-Objective Reinforcement Learning Approach

  • 2024-09-24 06:25:24
  • Dohyeong Kim, Hyeokjin Kwon, Junseok Kim, Gunmin Lee, Songhwai Oh

Abstract

As the complexity of tasks addressed through reinforcement learning (RL) increases, the definition of reward functions has also become highly complicated. We introduce an RL method aimed at simplifying the reward-shaping process through intuitive strategies. First, instead of a single reward function composed of various terms, we define multiple reward and cost functions within a constrained multi-objective RL (CMORL) framework. Second, for tasks involving sequential complex movements, we segment the task into distinct stages and define multiple rewards and costs for each stage. Finally, we introduce a practical CMORL algorithm that maximizes objectives based on these rewards while satisfying constraints defined by the costs. The proposed method has been successfully demonstrated across a variety of acrobatic tasks in both simulation and real-world environments. Additionally, it has been shown to perform these tasks successfully in comparison with existing RL and constrained RL algorithms. Our code is available at https://github.com/rllab-snu/Stage-Wise-CMORL.
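
To make the stage-wise idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical two-stage task ("jump" and "land"), a hypothetical flat state dictionary with the keys `vertical_velocity`, `energy`, and `body_contact`, and a fixed number of objectives and constraints across stages. Each stage carries its own tuple of reward and cost functions, and a separate check compares average costs against thresholds, which is how a constraint is commonly expressed in constrained RL. All names below are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical state representation: named scalar features.
State = Dict[str, float]


@dataclass(frozen=True)
class Stage:
    """One stage of a task, with its own reward and cost functions."""
    name: str
    rewards: Tuple[Callable[[State], float], ...]  # one per objective
    costs: Tuple[Callable[[State], float], ...]    # one per constraint


# Illustrative stages only; the terms are assumptions, not taken from the paper.
STAGES = (
    Stage(
        name="jump",
        rewards=(
            lambda s: s["vertical_velocity"],       # objective 1: go up
            lambda s: -s["energy"],                 # objective 2: save energy
        ),
        costs=(lambda s: float(s["body_contact"] > 0.0),),
    ),
    Stage(
        name="land",
        rewards=(
            lambda s: -abs(s["vertical_velocity"]),  # objective 1: come to rest
            lambda s: -s["energy"],                  # objective 2: save energy
        ),
        costs=(lambda s: float(s["body_contact"] > 0.0),),
    ),
)


def evaluate(stage_index: int, state: State) -> Tuple[List[float], List[float]]:
    """Return the reward vector and cost vector of the active stage."""
    stage = STAGES[stage_index]
    rewards = [r(state) for r in stage.rewards]
    costs = [c(state) for c in stage.costs]
    return rewards, costs


def constraints_satisfied(avg_costs: Sequence[float],
                          thresholds: Sequence[float]) -> bool:
    """Check that every average cost stays at or below its threshold."""
    return all(c <= d for c, d in zip(avg_costs, thresholds))
```

In this sketch the policy optimizer would receive a vector of rewards rather than one weighted sum, so each term can be written and tuned separately for the stage it belongs to.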

 
