Goal-Conditioned Reinforcement Learning in the Presence of an Adversary

Abstract

Reinforcement learning has seen increasing applications in real-world contexts over the past few years. However, physical environments are often imperfect, and policies that perform well in simulation might not achieve the same performance when applied elsewhere. A common approach to combat this is to train agents in the presence of an adversary. The adversary acts to destabilise the agent, which as a result learns a more robust policy and can better handle realistic conditions. Many real-world applications of reinforcement learning also make use of goal-conditioning: this is particularly useful in the context of robotics, as it allows the agent to act differently depending on which goal is selected. Here, we focus on the problem of goal-conditioned learning in the presence of an adversary. We first present DigitFlip and CLEVR-Play, two novel goal-conditioned environments that support acting against an adversary. Next, we propose EHER and CHER, two HER-based algorithms for goal-conditioned learning, and evaluate their performance. Finally, we unify the two threads and introduce IGOAL: a novel framework for goal-conditioned learning in the presence of an adversary. Experimental results show that combining IGOAL with EHER allows agents to significantly outperform existing approaches when acting against both random and competent adversaries.
