Adaptive $Q$-Aid for Conditional Supervised Learning in Offline Reinforcement Learning

  • 2024-10-22 13:46:09
  • Jeonghye Kim, Suyoung Lee, Woojun Kim, Youngchul Sung

Abstract

Offline reinforcement learning (RL) has progressed with return-conditioned supervised learning (RCSL), but its lack of stitching ability remains a limitation. We introduce $Q$-Aided Conditional Supervised Learning (QCS), which effectively combines the stability of RCSL with the stitching capability of $Q$-functions. By analyzing $Q$-function over-generalization, which impairs stable stitching, QCS adaptively integrates $Q$-aid into RCSL's loss function based on trajectory return. Empirical results show that QCS significantly outperforms RCSL and value-based methods, consistently achieving or exceeding the maximum trajectory returns across diverse offline RL benchmarks.
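
The abstract does not state the form of the combined loss, so the following is only a minimal sketch of what a return-dependent $Q$-aid could look like, not the authors' implementation. It assumes a squared-error RCSL term, a $Q$-aid term equal to the mean $Q$-value of the predicted actions, and a hypothetical weight that shrinks linearly to zero as the trajectory return approaches the best return. The names `q_aid_weight`, `qcs_style_loss`, `max_return`, and `scale` are all illustrative.

```python
import numpy as np


def q_aid_weight(traj_return, max_return, scale=1.0):
    """Hypothetical weight: large for low-return trajectories, zero at max_return."""
    return scale * max(max_return - traj_return, 0.0)


def qcs_style_loss(pred_actions, data_actions, q_values,
                   traj_return, max_return, scale=1.0):
    """Sketch of an RCSL loss with a return-dependent Q-aid term (assumed form).

    pred_actions, data_actions: arrays of shape (batch, action_dim)
    q_values: array of shape (batch,), Q(s, predicted action) from a
              separately trained critic (assumed to be given)
    """
    # Supervised (RCSL) term: match the actions found in the dataset.
    rcsl_term = np.mean(np.sum((pred_actions - data_actions) ** 2, axis=-1))
    # Q-aid term: prefer predicted actions that the critic rates highly.
    q_term = np.mean(q_values)
    # The weight depends on the return of the trajectory the batch came from.
    w = q_aid_weight(traj_return, max_return, scale)
    return rcsl_term - w * q_term
```

Under these assumptions, a batch drawn from a trajectory at the best return is trained by the supervised term alone, while a batch from a low-return trajectory leans more heavily on the $Q$-function.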

 
