Human Engagement Providing Evaluative and Informative Advice for Interactive Reinforcement Learning

  • 2020-09-21 02:14:02
  • Adam Bignold, Francisco Cruz, Richard Dazeley, Peter Vamplew, Cameron Foale

Abstract

Reinforcement learning is an approach used by intelligent agents to autonomously learn new skills. Although reinforcement learning has been demonstrated to be an effective learning approach in several different contexts, a common drawback is the time needed to satisfactorily learn a task, especially in large state-action spaces. To address this issue, interactive reinforcement learning proposes the use of externally-sourced information in order to speed up the learning process. Up to now, different information sources have been used to give advice to the learner agent, among them human-sourced advice. When interacting with a learner agent, humans may provide either evaluative or informative advice. From the agent's perspective, these styles of interaction are commonly referred to as reward-shaping and policy-shaping respectively. Evaluative advice requires the human to provide feedback on the prior action performed, while informative advice requires them to provide advice on the best action to select for a given situation. Prior research has focused on the effect of human-sourced advice on the interactive reinforcement learning process, specifically aiming to improve the learning speed of the agent while reducing the engagement with the human. This work presents an experimental setup for a human trial designed to compare the methods people use to deliver advice in terms of human engagement. Obtained results show that users giving informative advice to the learner agents provide more accurate advice, are willing to assist the learner agent for a longer time, and provide more advice per episode. Additionally, self-evaluation from participants using the informative approach has indicated that the agent's ability to follow the advice is higher, and therefore, they feel their own advice to be of higher accuracy when compared to people providing evaluative advice.
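
The distinction between the two advice styles can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the function names, the two-action set, the state labels, and the hyperparameters are all hypothetical, chosen only to show where each kind of advice could enter the learning loop. Evaluative advice is assumed to be a scalar added to the environment reward (reward-shaping), and informative advice is assumed to simply override the agent's own action choice (policy-shaping).

```python
import random
from collections import defaultdict

ACTIONS = ["left", "right"]  # hypothetical action set, for illustration only


def choose_action(q, state, epsilon=0.1, advised_action=None):
    """Epsilon-greedy selection; informative advice (policy-shaping),
    when present, replaces the agent's own choice."""
    if advised_action is not None:
        return advised_action
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


def q_update(q, state, action, env_reward, next_state,
             human_feedback=0.0, alpha=0.5, gamma=0.9):
    """One Q-learning step; evaluative advice (reward-shaping) is
    added to the environment reward for the action just performed."""
    shaped_reward = env_reward + human_feedback
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    td_error = shaped_reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


q = defaultdict(float)  # Q-table with all values initialised to 0.0

# Informative advice: the human names the action to take in state "s0".
action = choose_action(q, "s0", advised_action="right")

# Evaluative advice: the human rates the action that was just performed.
q_update(q, "s0", action, env_reward=0.0, next_state="s1", human_feedback=1.0)

# Without further advice, the greedy choice now follows the shaped value.
greedy_action = choose_action(q, "s0", epsilon=0.0)
```

In this sketch the informative adviser acts before the action is taken, while the evaluative adviser acts after it, which mirrors the description above of advice on the best action versus feedback on the prior action.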

 
