Augmented Random Search for Quadcopter Control: An alternative to Reinforcement Learning

Abstract

Model-based reinforcement learning strategies are believed to exhibit more significant sample complexity than model-free strategies for controlling dynamical systems such as quadcopters. The belief that model-based strategies, which rely on well-trained neural networks to make such high-level decisions, always give better performance can be dispelled by using model-free policy search methods. This paper proposes the use of a model-free random search strategy called Augmented Random Search (ARS), a better and faster approach to linear policy training for continuous control tasks such as controlling a quadcopter's flight. The method achieves state-of-the-art accuracy without the large amounts of data needed to train the neural networks used in previous approaches to quadcopter control. The paper also presents simulation results for the search strategy in a strategically designed task environment. Reward collection over 1000 episodes and the agent's behavior in flight under ARS are compared with those of a state-of-the-art reinforcement learning algorithm, Deep Deterministic Policy Gradient (DDPG). Our simulations and results show that commonly used strategies exhibit high variability in performance with respect to sample efficiency on such tasks, whereas the policy network built by ARS-Quad reacts relatively accurately to a step response, providing a better-performing alternative to reinforcement learning strategies.
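As an illustration of the kind of linear policy search the abstract refers to, the following is a minimal sketch of a basic ARS-style update. It is not the paper's ARS-Quad implementation: the environment is a hypothetical one-dimensional altitude-hold task (a double integrator) standing in for a quadcopter simulator, and the names rollout and ars_train, the reward weights, and all hyperparameter values are assumptions chosen for illustration only.

```python
import numpy as np


def rollout(M, horizon=200, dt=0.05):
    """Total reward of the linear policy a = M @ s on a toy altitude task.

    Hypothetical stand-in environment: state s = [altitude error, vertical
    velocity], action a = vertical acceleration command (clipped).
    """
    s = np.array([1.0, 0.0])  # start 1 unit away from the target altitude
    total = 0.0
    for _ in range(horizon):
        a = float(np.clip(M @ s, -5.0, 5.0)[0])
        # Explicit Euler step of a double integrator.
        s = np.array([s[0] + dt * s[1], s[1] + dt * a])
        # Penalize altitude error, velocity, and control effort.
        total -= s[0] ** 2 + 0.1 * s[1] ** 2 + 0.001 * a ** 2
    return total


def ars_train(n_iters=100, n_dirs=8, top_b=4, step=0.02, noise=0.05, seed=0):
    """Basic random search over the weights of a linear policy."""
    rng = np.random.default_rng(seed)
    M = np.zeros((1, 2))  # policy matrix: one action, two state variables
    history = [rollout(M)]
    for _ in range(n_iters):
        # Sample random directions and evaluate both perturbation signs.
        deltas = rng.standard_normal((n_dirs, *M.shape))
        r_plus = np.array([rollout(M + noise * d) for d in deltas])
        r_minus = np.array([rollout(M - noise * d) for d in deltas])
        # Keep the top_b directions ranked by their better rollout.
        best = np.argsort(np.maximum(r_plus, r_minus))[-top_b:]
        # Scale the step by the std of the rewards actually used.
        sigma = np.concatenate([r_plus[best], r_minus[best]]).std() + 1e-8
        grad = np.tensordot(r_plus[best] - r_minus[best], deltas[best], axes=1)
        M = M + step / (top_b * sigma) * grad
        history.append(rollout(M))
    return M, history


if __name__ == "__main__":
    M, history = ars_train()
    print("initial reward:", history[0])
    print("final reward:  ", history[-1])
    print("learned policy matrix:", M)
```

The sketch omits the online state normalization used in some ARS variants, and a real quadcopter task would have a much larger state and action space than this toy example.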
