A Broad-persistent Advising Approach for Deep Interactive Reinforcement Learning in Robotic Environments

Abstract

Deep Reinforcement Learning (DeepRL) methods have been widely used in robotics to learn about the environment and acquire behaviors autonomously. Deep Interactive Reinforcement Learning (DeepIRL) adds interactive feedback from an external trainer or expert, who gives advice that helps the learner choose actions and so speeds up the learning process. However, current research has been limited to interactions that offer actionable advice for only the current state of the agent. Additionally, the agent discards the information after a single use, so the same process must be repeated when the state is revisited. In this paper, we present Broad-persistent Advising (BPA), an approach that retains and reuses the processed information. It not only helps trainers give more general advice that is relevant to similar states instead of only the current state, but also allows the agent to speed up the learning process. We test the proposed approach in two continuous robotic scenarios, namely, a cart-pole balancing task and a simulated robot navigation task. The obtained results show that the performance of the agent using BPA improves while maintaining the number of interactions required of the trainer, in comparison to the DeepIRL approach.
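
To make the idea of retaining and reusing advice concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes that "similar states" can be grouped by uniformly binning each dimension of a continuous state, which is only one possible generalization scheme. The names AdviceMemory, choose_action, and bin_width are hypothetical and introduced purely for illustration.

```python
import math
from typing import Callable, Dict, Optional, Sequence, Tuple

State = Sequence[float]
Key = Tuple[int, ...]


class AdviceMemory:
    """Hypothetical persistent store mapping coarse state regions to advised actions."""

    def __init__(self, bin_width: float = 0.5) -> None:
        # Assumption: states whose coordinates fall in the same bins count as "similar".
        self.bin_width = bin_width
        self._store: Dict[Key, int] = {}

    def key(self, state: State) -> Key:
        # Map each continuous coordinate to the index of the bin that contains it.
        return tuple(math.floor(x / self.bin_width) for x in state)

    def remember(self, state: State, action: int) -> None:
        # Persist the advice so that it applies to the whole region, not one state.
        self._store[self.key(state)] = action

    def recall(self, state: State) -> Optional[int]:
        # Return previously stored advice for this region, or None if there is none.
        return self._store.get(self.key(state))


def choose_action(
    state: State,
    memory: AdviceMemory,
    policy: Callable[[State], int],
    trainer: Optional[Callable[[State], Optional[int]]] = None,
) -> Tuple[int, str]:
    """Pick an action, preferring retained advice over a new trainer interaction."""
    reused = memory.recall(state)
    if reused is not None:
        return reused, "reused"          # no new interaction needed
    if trainer is not None:
        advice = trainer(state)          # the trainer may decline by returning None
        if advice is not None:
            memory.remember(state, advice)
            return advice, "advised"
    return policy(state), "policy"       # fall back to the agent's own policy
```

In this sketch, advice given once for a state is returned again for any later state that falls in the same bin, so the trainer is not asked a second time. A real system would likely also need a way to stop following stored advice that turns out to be wrong; that is omitted here.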
