Intent-aware Multi-agent Reinforcement Learning

Abstract

This paper proposes an intent-aware multi-agent planning framework as well asa learning algorithm. Under this framework, an agent plans in the goal space tomaximize the expected utility. The planning process takes the belief of otheragents' intents into consideration. Instead of formulating the learning problemas a partially observable Markov decision process (POMDP), we propose a simplebut effective linear function approximation of the utility function. It isbased on the observation that for humans, other people's intents will pose aninfluence on our utility for a goal. The proposed framework has several majoradvantages: i) it is computationally feasible and guaranteed to converge. ii)It can easily integrate existing intent prediction and low-level planningalgorithms. iii) It does not suffer from sparse feedbacks in the action space.We experiment our algorithm in a real-world problem that is non-episodic, andthe number of agents and goals can vary over time. Our algorithm is trained ina scene in which aerial robots and humans interact, and tested in a novel scenewith a different environment. Experimental results show that our algorithmachieves the best performance and human-like behaviors emerge during thedynamic process.

Quick Read (beta)

loading the full paper ...