HMRL: Hyper-Meta Learning for Sparse Reward Reinforcement Learning Problem

  • 2021-06-05 06:36:21
  • Yun Hua, Xiangfeng Wang, Bo Jin, Wenhao Li, Junchi Yan, Xiaofeng He, Hongyuan Zha
Despite the success of existing meta reinforcement learning methods, they still have difficulty learning a meta policy effectively for RL problems with sparse rewards. To address this, we develop a novel meta reinforcement learning framework called Hyper-Meta RL (HMRL) for sparse-reward RL problems. It consists of three modules, including a cross-environment meta state embedding module, which constructs a common meta state space to adapt to different environments, and a meta-state-based, environment-specific meta reward shaping module, which effectively extends the original sparse reward trajectory through cross-environmental knowledge complementarity. As a consequence, the meta policy achieves better generalization and efficiency with the shaped meta reward. Experiments on sparse-reward environments show the superiority of HMRL in both transferability and policy learning efficiency.
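To make the idea of densifying a sparse reward trajectory concrete, the following minimal sketch uses classic potential-based reward shaping on a toy 1-D chain. It is an illustration only and is not the HMRL reward shaping module: the abstract does not specify how the meta reward is computed, so the function `shape_rewards`, the hand-written `potential`, and the chain environment are all hypothetical stand-ins.

```python
from typing import Callable, List, Sequence


def shape_rewards(
    states: Sequence[int],
    rewards: Sequence[float],
    potential: Callable[[int], float],
    gamma: float = 0.99,
) -> List[float]:
    """Potential-based shaping: r'_t = r_t + gamma * phi(s_{t+1}) - phi(s_t).

    `states` must hold one more entry than `rewards` (it includes the
    final state). This is a generic technique, assumed here purely for
    illustration; it is not taken from the HMRL paper.
    """
    if len(states) != len(rewards) + 1:
        raise ValueError("states must have exactly len(rewards) + 1 entries")
    return [
        r + gamma * potential(s_next) - potential(s)
        for r, s, s_next in zip(rewards, states[:-1], states[1:])
    ]


GOAL = 4  # hypothetical goal position on a 1-D chain


def potential(state: int) -> float:
    # Hypothetical hand-crafted potential: negative distance to the goal.
    return -float(abs(GOAL - state))


# Sparse trajectory: the only nonzero reward arrives on reaching the goal.
states = [0, 1, 2, 3, 4]
sparse = [0.0, 0.0, 0.0, 1.0]

# gamma = 1.0 keeps the arithmetic exact for this toy example.
shaped = shape_rewards(states, sparse, potential, gamma=1.0)
print(shaped)  # [1.0, 1.0, 1.0, 2.0]
```

In this toy case every step toward the goal now carries a learning signal, whereas the original trajectory is informative only at the final step. A learned shaping function defined over a shared state embedding would play the role of the hand-written `potential` here, under the assumptions stated above.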


