Learning user-defined sub-goals using memory editing in reinforcement learning

Abstract

The aim of reinforcement learning (RL) is to enable an agent to achieve a final goal. Most RL studies have focused on improving learning efficiency so that the final goal is reached faster. However, it is very difficult to modify the intermediate route an RL model takes on the way to the final goal. That is, in existing studies the agent cannot be controlled to achieve other sub-goals. If the agent could pass through sub-goals on the way to its destination, RL could be applied and studied in a wider range of fields. In this study, I propose a methodology that uses memory editing to achieve user-defined sub-goals as well as the final goal. Memory editing is performed to generate various sub-goals and to give the agent an additional reward. In addition, the sub-goals are learned separately from the final goal. I set up two simple test environments with various scenarios. As a result, the agent, under control, was largely successful in passing through the sub-goals and reaching the final goal. Moreover, the agent could be indirectly induced to visit novel states in the environments. I expect that this methodology can be used in fields that require controlling an agent in a variety of scenarios.
