FiDi-RL: Incorporating Deep Reinforcement Learning with Finite-Difference Policy Search for Efficient Learning of Continuous Control

Abstract

In recent years, significant progress has been made on challenging problems using reinforcement learning (RL). Despite this success, RL still faces challenges in continuous control tasks. Conventional methods typically compute derivatives of the optimization objective at considerable computational cost, and they tend to be inefficient, unstable, and lacking in robustness on such tasks. Alternatively, derivative-free methods treat the optimization process as a black box and show robustness and stability in learning continuous control tasks, but they are not data-efficient. Combining the two approaches to get the best of both has attracted attention. However, most existing combinations adopt complex neural networks (NNs) as the control policy. Deep NNs are a double-edged sword: they can yield better performance, but they also make parameter tuning and computation more difficult. To this end, in this paper we present a novel method called FiDi-RL, which incorporates deep RL with Finite-Difference (FiDi) policy search. FiDi-RL combines Deep Deterministic Policy Gradients (DDPG) with Augmented Random Search (ARS) and aims to improve the data efficiency of ARS. The empirical results show that FiDi-RL can improve the performance and stability of ARS, and that it provides competitive results against some existing deep reinforcement learning methods.
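The abstract names finite-difference policy search without giving an update rule. As a rough illustration of the idea only, the sketch below shows one ARS-style finite-difference step on a toy objective: perturb the parameters along random directions, compare the rewards of the positive and negative perturbations, and move along the best directions. It is not the FiDi-RL algorithm and does not include the DDPG component. The function names (`ars_step`, `toy_reward`, `demo`), the hyperparameter values, and the quadratic reward are all assumptions made for this example.

```python
import numpy as np

# Hypothetical toy objective standing in for an episode return:
# reward is highest when theta equals TARGET.
TARGET = np.array([1.0, -2.0, 0.5])


def toy_reward(theta):
    return -float(np.sum((theta - TARGET) ** 2))


def ars_step(theta, reward_fn, rng, n_dirs=8, top_b=4, step_size=0.02, noise=0.03):
    """One finite-difference random-search update (a sketch in the style of ARS).

    All hyperparameter values here are illustrative assumptions.
    """
    # Sample random perturbation directions with the same shape as theta.
    deltas = rng.standard_normal((n_dirs,) + theta.shape)
    # Evaluate the objective at theta +/- noise * delta for every direction.
    r_plus = np.array([reward_fn(theta + noise * d) for d in deltas])
    r_minus = np.array([reward_fn(theta - noise * d) for d in deltas])
    # Keep the top_b directions ranked by the better of their two rewards.
    best = np.argsort(-np.maximum(r_plus, r_minus))[:top_b]
    # Scale the step by the standard deviation of the rewards actually used.
    sigma_r = np.concatenate([r_plus[best], r_minus[best]]).std() + 1e-8
    # Finite-difference estimate: sum of (r+ - r-) * delta over kept directions.
    grad = np.tensordot(r_plus[best] - r_minus[best], deltas[best], axes=1)
    return theta + step_size / (top_b * sigma_r) * grad


def demo(n_steps=200, seed=0):
    """Run a few updates and return (initial reward, final reward)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros_like(TARGET)
    initial = toy_reward(theta)
    for _ in range(n_steps):
        theta = ars_step(theta, toy_reward, rng)
    return initial, toy_reward(theta)


if __name__ == "__main__":
    before, after = demo()
    print(f"reward before: {before:.3f}, after: {after:.3f}")
```

Each step costs `2 * n_dirs` objective evaluations and uses no derivatives of the objective, which illustrates why such methods are robust but comparatively data-hungry, the limitation the abstract says FiDi-RL targets.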
