Learning to Fly via Deep Model-Based Reinforcement Learning

Abstract

Learning to control robots without requiring models has been a long-term goal, promising diverse and novel applications. Yet, reinforcement learning has only achieved limited impact on real-time robot control due to its high demand of real-world interactions. In this work, by leveraging a learnt probabilistic model of drone dynamics, we achieve human-like quadrotor control through model-based reinforcement learning. No prior knowledge of the flight dynamics is assumed; instead, a sequential latent variable model, used generatively and as an online filter, is learnt from raw sensory input. The controller and value function are optimised entirely by propagating stochastic analytic gradients through generated latent trajectories. We show that "learning to fly" can be achieved with less than 30 minutes of experience with a single drone, and can be deployed solely using onboard computational resources and sensors, on a self-built drone.
