Abstract
Current Deep Reinforcement Learning algorithms still heavily rely on handcrafted neural network architectures. We propose a novel approach to automatically find strong topologies for continuous control tasks while only adding a minor overhead in terms of interactions in the environment. To achieve this, we combine Neuroevolution techniques with off-policy training and propose a novel architecture mutation operator. Experiments on five continuous control benchmarks show that the proposed Actor-Critic Neuroevolution algorithm often outperforms the strong Actor-Critic baseline and is capable of automatically finding topologies in a sample-efficient manner which would otherwise have to be found by expensive architecture search.