Influencing Reinforcement Learning through Natural Language Guidance

Abstract

Interactive reinforcement learning agents use human feedback or instruction to help them learn in complex environments. Often, this feedback comes in the form of a discrete signal that is either positive or negative. While informative, such a signal can be difficult to generalize on its own. In this work, we explore how natural language advice can be used to provide a richer feedback signal to a reinforcement learning agent by extending policy shaping, a well-known interactive reinforcement learning technique. Policy shaping usually employs a human feedback policy to help an agent learn more about how to achieve its goal. In our case, we replace this human feedback policy with a policy generated from natural language advice. We aim to investigate whether the generated natural language reasoning helps a deep reinforcement learning agent choose its actions successfully in any given environment. To this end, we design our model with three networks: an experience-driven network, an advice generator, and an advice-driven network. The experience-driven reinforcement learning agent chooses its actions under the influence of the environmental reward, while the advice-driven network, using feedback produced by the advice generator for each new state, selects actions that assist the reinforcement learning agent through policy shaping.
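
As a rough illustration of the policy shaping step described above, the sketch below combines an experience-driven action distribution with an advice-driven one by multiplying them and renormalizing, which is a common formulation of policy shaping. It is not the paper's implementation: the function names `softmax` and `shape_policy`, the example scores, and the fallback when the two distributions share no support are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw action scores into a probability distribution."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def shape_policy(experience_probs, advice_probs):
    """Combine two action distributions by elementwise product.

    Assumption: both inputs are distributions over the same action set.
    If they share no support, fall back to the experience-driven policy.
    """
    products = [p * q for p, q in zip(experience_probs, advice_probs)]
    total = sum(products)
    if total == 0.0:
        return list(experience_probs)
    return [x / total for x in products]


# Hypothetical scores for three actions in a single state.
experience_scores = [1.0, 1.0, 0.0]  # stand-in for the experience-driven network
advice_scores = [0.0, 2.0, 0.0]      # stand-in for the advice-driven network

shaped = shape_policy(softmax(experience_scores), softmax(advice_scores))
action = max(range(len(shaped)), key=lambda i: shaped[i])
```

In this toy state the experience-driven scores tie on the first two actions, and the advice-driven distribution breaks the tie in favor of the second action.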
