Learning to Share and Hide Intentions using Information Regularization

Abstract

Learning to cooperate with friends and compete with foes is a key component of multi-agent reinforcement learning. Typically to do so, one requires access to either a model of or interaction with the other agent(s). Here we show how to learn effective strategies for cooperation and competition in an asymmetric information game with no such model or interaction. Our approach is to encourage an agent to reveal or hide their intentions using an information-theoretic regularizer. We consider both the mutual information between goal and action given state, as well as the mutual information between goal and state. We show how to stochastically optimize these regularizers in a way that is easy to integrate with policy gradient reinforcement learning. Finally, we demonstrate that cooperative (competitive) policies learned with our approach lead to more (less) reward for a second agent in two simple asymmetric information games.
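To make the first regularizer concrete, the following is a minimal sketch of the mutual information between goal and action at a single fixed state, computed exactly for a small tabular goal-conditioned policy. The function name, the array layout (`policy[g, a]`), and the toy policies are illustrative assumptions and are not taken from the paper, which optimizes such quantities stochastically rather than by exact enumeration.

```python
import numpy as np

def action_goal_information(policy, goal_prior):
    """Return I(A; G | S=s) in nats for one fixed state s.

    policy[g, a]  : probability of action a given goal g at state s (assumed layout)
    goal_prior[g] : probability of goal g at state s
    """
    policy = np.asarray(policy, dtype=float)
    goal_prior = np.asarray(goal_prior, dtype=float)

    joint = goal_prior[:, None] * policy   # p(g, a | s)
    marginal = goal_prior @ policy         # p(a | s), goal marginalized out

    # log [ pi(a | s, g) / p(a | s) ], defined as 0 wherever the joint is 0
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(joint > 0, np.log(policy / marginal), 0.0)

    return float(np.sum(joint * log_ratio))

uniform_goals = [0.5, 0.5]

# Each goal maps to a distinct action: the action fully reveals the goal.
revealing = [[1.0, 0.0],
             [0.0, 1.0]]

# Both goals use the same action distribution: the action says nothing about the goal.
hiding = [[0.5, 0.5],
          [0.5, 0.5]]

info_revealing = action_goal_information(revealing, uniform_goals)
info_hiding = action_goal_information(hiding, uniform_goals)
```

Under these assumptions, the revealing policy attains the maximum of log 2 nats for two equally likely goals, while the hiding policy attains zero. A regularizer built on this quantity could plausibly be weighted positively to encourage revealing intentions and negatively to encourage hiding them; the exact form used in the paper is not given in the abstract.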
