Boosting Long-Delayed Reinforcement Learning with Auxiliary Short-Delayed Task

  • 2024-02-05 16:11:03
  • Qingyuan Wu, Simon Sinong Zhan, Yixuan Wang, Chung-Wei Lin, Chen Lv, Qi Zhu, Chao Huang


Reinforcement learning is challenging in delayed scenarios, a common real-world situation where observations and interactions occur with delays. State-of-the-art (SOTA) state-augmentation techniques suffer from either state-space explosion as the number of delayed steps grows or performance degeneration in stochastic environments. To address these challenges, our novel Auxiliary-Delayed Reinforcement Learning (AD-RL) leverages an auxiliary short-delayed task to accelerate learning on a long-delayed task without compromising performance in stochastic environments. Specifically, AD-RL learns the value function in the short-delayed task and then employs it with bootstrapping and policy improvement techniques in the long-delayed task. We theoretically show that this can greatly reduce the sample complexity compared to learning directly on the original long-delayed task. On deterministic and stochastic benchmarks, our method remarkably outperforms the SOTA methods in both sample efficiency and policy performance.
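
As a rough illustration of the state-space explosion mentioned in the abstract, the sketch below counts augmented states in a finite MDP. It assumes the common augmentation in which the agent's state is the most recently observed state together with the actions taken during the delay window; the abstract does not spell out this construction, and the function name and the example sizes (10 states, 4 actions) are hypothetical.

```python
def augmented_state_count(num_states: int, num_actions: int, delay: int) -> int:
    """Count augmented states (s, a_1, ..., a_delay) in a finite MDP.

    Assumption: the augmented state is the last observed state plus the
    `delay` actions taken since that observation.
    """
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return num_states * num_actions ** delay


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the exponential growth in delay.
    for delay in (0, 2, 5, 10):
        print(f"delay={delay:2d}  augmented states={augmented_state_count(10, 4, delay)}")
```

Under this assumption the count grows exponentially in the delay, which is consistent with the motivation for learning first on a shorter-delayed auxiliary task, whose augmented space is much smaller.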


