Using Natural Language for Reward Shaping in Reinforcement Learning

Abstract

Recent reinforcement learning (RL) approaches have shown strong performance in complex domains such as Atari games, but are often highly sample inefficient. A common approach to reduce interaction time with the environment is to use reward shaping, which involves carefully designing reward functions that provide the agent intermediate rewards for progress towards the goal. However, designing appropriate shaping rewards is known to be difficult as well as time-consuming. In this work, we address this problem by using natural language instructions to perform reward shaping. We propose the LanguagE-Action Reward Network (LEARN), a framework that maps free-form natural language instructions to intermediate rewards based on actions taken by the agent. These intermediate language-based rewards can seamlessly be integrated into any standard reinforcement learning algorithm. We experiment with Montezuma's Revenge from the Atari Learning Environment, a popular benchmark in RL. Our experiments on a diverse set of 15 tasks demonstrate that, for the same number of interactions with the environment, language-based rewards lead to successful completion of the task 60% more often on average, compared to learning without language.
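
To make the idea of language-based reward shaping concrete, the following is a minimal sketch, not the LEARN model itself. It assumes a scoring function that takes an instruction and the agent's recent actions and returns a number, and adds a weighted copy of that score to the environment reward. The names shaped_reward and toy_language_reward, the weight parameter, and the keyword-to-action table are all hypothetical and chosen only for illustration; in the paper the score comes from a learned network, which a keyword match merely stands in for here.

```python
from typing import Callable, Sequence

# Hypothetical interface: scores how well recent actions relate to an instruction.
LanguageRewardFn = Callable[[str, Sequence[int]], float]


def shaped_reward(
    env_reward: float,
    instruction: str,
    actions: Sequence[int],
    language_reward_fn: LanguageRewardFn,
    weight: float = 1.0,
) -> float:
    """Add a weighted language-based intermediate reward to the environment reward."""
    return env_reward + weight * language_reward_fn(instruction, actions)


def toy_language_reward(instruction: str, actions: Sequence[int]) -> float:
    """Toy stand-in for a learned model (assumption, not the paper's network).

    Returns the fraction of recent actions whose id matches a keyword
    that appears in the instruction.
    """
    keyword_to_action = {"left": 0, "right": 1, "jump": 2}  # made-up action ids
    words = instruction.lower().split()
    wanted = {a for word, a in keyword_to_action.items() if word in words}
    if not actions or not wanted:
        return 0.0
    return sum(a in wanted for a in actions) / len(actions)


# Example: a sparse environment reward of 0 still yields a learning signal.
r = shaped_reward(0.0, "jump to the right", [1, 2, 0, 1], toy_language_reward, weight=0.5)
```

Because the shaped value is just a scalar reward, it can be passed to any standard RL algorithm in place of the raw environment reward, which is the integration property the abstract describes.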
