Failed Goal Aware Hindsight Experience Replay

Abstract

In multi-goal reinforcement learning for a given environment, agents learn policies to achieve multiple goals by using experiences gained from interactions with the environment. One of the key challenges in this setting is training agents using sparse binary rewards, which can be difficult due to a lack of successful experiences. To address this challenge, hindsight experience replay (HER) generates successful experiences from unsuccessful experiences. However, the process of generating successful experiences from uniformly sampled ones can be inefficient. In this paper, a novel approach called Failed goal Aware HER (FAHER) is proposed to enhance the sampling efficiency. The approach exploits the property of achieved goals in relation to failed goals that are defined as the original goals not achieved. The proposed method involves clustering episodes with different achieved goals using a cluster model and subsequently sampling experiences in the manner of HER. The cluster model is generated by applying a clustering algorithm to failed goals. The proposed method is validated by experiments with three robotic control tasks of the OpenAI gym. The results of experiments demonstrate that the proposed method is more sample efficient and achieves improved performance over baseline approaches.
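
The pipeline described in the abstract (fit a cluster model on failed goals, group episodes by their achieved goals, then relabel in the manner of HER) might look roughly like the sketch below. It is an illustration only, not the authors' implementation. Everything the abstract leaves unspecified is an assumption here: k-means as the clustering algorithm, the final achieved goal as each episode's cluster key, uniform sampling over non-empty clusters, the "future" relabeling strategy, a 0/-1 sparse reward with distance threshold `eps`, and all function and field names.

```python
import numpy as np


def assign_clusters(points, centroids):
    """Return the index of the nearest centroid for each point."""
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=-1)
    return dists.argmin(axis=1)


def fit_kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed clustering algorithm); returns (k, dim) centroids."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign_clusters(points, centroids)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def sample_faher_batch(episodes, centroids, batch_size, rng, eps=0.05):
    """Sample HER-style relabeled transitions, balanced across clusters.

    episodes: list of dicts, each with "achieved", an array of shape
    (T + 1, goal_dim) holding the achieved goal at every time step
    (hypothetical layout). Returns (episode_index, t, new_goal, reward) tuples.
    """
    # Cluster episodes by their final achieved goal, using the model
    # that was fitted on failed goals.
    final_achieved = np.stack([ep["achieved"][-1] for ep in episodes])
    labels = assign_clusters(final_achieved, centroids)
    non_empty = np.unique(labels)

    batch = []
    for _ in range(batch_size):
        cluster = rng.choice(non_empty)  # uniform over clusters (assumption)
        ep_idx = int(rng.choice(np.flatnonzero(labels == cluster)))
        achieved = episodes[ep_idx]["achieved"]
        horizon = len(achieved) - 1
        t = int(rng.integers(0, horizon))
        # HER "future" strategy: relabel with a goal achieved later on.
        future = int(rng.integers(t + 1, horizon + 1))
        new_goal = achieved[future]
        reached = np.linalg.norm(achieved[t + 1] - new_goal) <= eps
        batch.append((ep_idx, t, new_goal, 0.0 if reached else -1.0))
    return batch
```

In this sketch, `fit_kmeans` would be called on the original goals of unsuccessful episodes, and `sample_faher_batch` would stand in for the uniform episode sampling of standard HER.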
