Learning to Coordinate with Experts

Abstract

When deployed in dynamic environments, AI agents will inevitably encounter challenges that exceed their individual capabilities. Leveraging assistance from expert agents, whether human or AI, can significantly enhance safety and performance in such situations. However, querying experts is often costly, necessitating the development of agents that can efficiently request and utilize expert guidance. In this paper, we introduce a fundamental coordination problem called Learning to Yield and Request Control (YRC), where the objective is to learn a strategy that determines when to act autonomously and when to seek expert assistance. We consider a challenging practical setting in which an agent does not interact with experts during training but must adapt to novel environmental changes and expert interventions at test time. To facilitate empirical research, we introduce YRC-Bench, an open-source benchmark featuring diverse domains. YRC-Bench provides a standardized Gym-like API, simulated experts, an evaluation pipeline, and implementations of competitive baselines. Towards tackling the YRC problem, we propose a novel validation approach and investigate the performance of various learning methods across diverse environments, yielding insights that can guide future research.
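To make the decision at the heart of YRC concrete, the following minimal sketch shows one possible coordination rule: the agent requests expert control whenever the entropy of its own action distribution exceeds a threshold. This is an illustration under stated assumptions, not the method proposed in the paper and not the YRC-Bench API. The names `should_request_control`, `run_episode`, and `QUERY_COST`, as well as the entropy criterion itself, are hypothetical.

```python
import math

# Hypothetical per-step cost of handing control to the expert (assumption).
QUERY_COST = 0.2


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def should_request_control(agent_probs, threshold):
    """Return True when the agent is uncertain enough to yield to the expert."""
    return entropy(agent_probs) > threshold


def run_episode(agent_dists, threshold):
    """Apply the rule to a sequence of per-step action distributions.

    Returns the number of expert queries and the total query cost incurred.
    """
    n_queries = sum(should_request_control(d, threshold) for d in agent_dists)
    return n_queries, n_queries * QUERY_COST


if __name__ == "__main__":
    # Three steps: uncertain, confident, uncertain.
    dists = [[0.5, 0.5], [0.99, 0.01], [0.4, 0.6]]
    queries, cost = run_episode(dists, threshold=0.5)
    print(f"expert queries: {queries}, total query cost: {cost:.2f}")
```

Raising the threshold makes the agent act autonomously more often and lowers the query cost; lowering it shifts more control to the expert. Choosing that trade-off without access to the expert during training is the kind of difficulty the YRC setting highlights.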
