Learning to Seek: Autonomous Source Seeking with Deep Reinforcement Learning Onboard a Nano Drone Microcontroller

Abstract

Fully autonomous navigation using nano drones has numerous applications in the real world, ranging from search and rescue to source seeking. Nano drones are well-suited for source seeking because of their agility, low price, and ubiquitous character. Unfortunately, their constrained form factor limits flight time, sensor payload, and compute capability. These challenges are a crucial limitation for the use of source-seeking nano drones in GPS-denied and highly cluttered environments. Here, we introduce a fully autonomous deep reinforcement learning-based light-seeking nano drone. The 33-gram nano drone performs all computation on board its ultra-low-power microcontroller (MCU). We present a method for efficiently training, converting, and utilizing deep reinforcement learning policies. Our training methodology and novel quantization scheme allow fitting the trained policy in 3 kB of memory. The quantization scheme uses representative input data and input scaling to arrive at a full 8-bit model. Finally, we evaluate the approach in simulation and flight tests using a Bitcraze CrazyFlie, achieving an 80% success rate on average in a highly cluttered and randomized test environment. Moreover, the drone finds the light source in 29% fewer steps compared to a baseline simulation (obstacle avoidance without source information). To our knowledge, this is the first deep reinforcement learning method that enables source seeking within a highly constrained nano drone demonstrating robust flight behavior. Our general methodology is suitable for any highly constrained (source-seeking) platform using deep reinforcement learning.
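
The abstract mentions that representative input data is used to arrive at a full 8-bit model. As a rough illustration of that general idea, the sketch below shows generic affine (scale and zero-point) int8 quantization calibrated from sample inputs. It is not the paper's scheme, which the abstract does not spell out; the function names and the example readings are hypothetical.

```python
def int8_range(num_bits=8):
    """Return the signed integer range for the given bit width."""
    return -(2 ** (num_bits - 1)), 2 ** (num_bits - 1) - 1


def calibrate(samples, num_bits=8):
    """Derive an affine (scale, zero_point) pair from representative inputs.

    Assumption: a simple min/max calibration; other calibration rules exist.
    """
    qmin, qmax = int8_range(num_bits)
    lo = min(min(samples), 0.0)  # keep 0.0 inside the range so it maps exactly
    hi = max(max(samples), 0.0)
    scale = (hi - lo) / (qmax - qmin) or 1.0  # guard against an all-zero range
    zero_point = int(round(qmin - lo / scale))
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map a real value to an 8-bit integer, clamping out-of-range inputs."""
    qmin, qmax = int8_range(num_bits)
    q = int(round(x / scale)) + zero_point
    return max(qmin, min(qmax, q))


def dequantize(q, scale, zero_point):
    """Map an 8-bit integer back to its approximate real value."""
    return (q - zero_point) * scale


# Hypothetical representative sensor readings used only for calibration.
representative = [0.0, 0.5, 1.0, 2.55]
scale, zero_point = calibrate(representative)
codes = [quantize(x, scale, zero_point) for x in representative]
```

Under these assumptions, every value inside the calibrated range is recovered to within half a quantization step, while values outside it saturate at the ends of the int8 range. This is why the calibration data should cover the inputs expected in flight.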
