Task-Aware Virtual Training: Enhancing Generalization in Meta-Reinforcement Learning for Out-of-Distribution Tasks

  • 2025-06-18 07:52:53
  • Jeongmo Kim, Yisak Park, Minung Kim, Seungyul Han

Abstract

Meta reinforcement learning aims to develop policies that generalize to unseen tasks sampled from a task distribution. While context-based meta-RL methods improve task representation using task latents, they often struggle with out-of-distribution (OOD) tasks. To address this, we propose Task-Aware Virtual Training (TAVT), a novel algorithm that accurately captures task characteristics for both training and OOD scenarios using metric-based representation learning. Our method successfully preserves task characteristics in virtual tasks and employs a state regularization technique to mitigate overestimation errors in state-varying environments. Numerical results demonstrate that TAVT significantly enhances generalization to OOD tasks across various MuJoCo and MetaWorld environments. Our code is available at https://github.com/JM-Kim-94/tavt.git.

 
