DAIL: Beyond Task Ambiguity for Language-Conditioned Reinforcement Learning

Abstract

Comprehending natural language and following human instructions are critical capabilities for intelligent agents. However, the flexibility of linguistic instructions induces substantial ambiguity across language-conditioned tasks, severely degrading algorithmic performance. To address these limitations, we present a novel method named DAIL (Distributional Aligned Learning), featuring two key components: distributional policy and semantic alignment. Specifically, we provide theoretical results showing that the value distribution estimation mechanism enhances task differentiability. Meanwhile, the semantic alignment module captures the correspondence between trajectories and linguistic instructions. Extensive experimental results on both structured and visual observation benchmarks demonstrate that DAIL effectively resolves instruction ambiguities, achieving superior performance to baseline methods. Our implementation is available at https://github.com/RunpengXie/Distributional-Aligned-Learning.
