Vision-based Navigation Using Deep Reinforcement Learning

Abstract

Deep reinforcement learning (RL) has been successfully applied to a variety of game-like environments. However, applying deep RL to visual navigation in realistic environments remains a challenging task. We propose a novel learning architecture capable of navigating an agent, e.g. a mobile robot, to a target given by an image. To achieve this, we extend the batched A2C algorithm with auxiliary tasks designed to improve visual navigation performance. We propose three additional auxiliary tasks: predicting the segmentation of the observation image, predicting the segmentation of the target image, and predicting the depth map. These tasks enable the use of supervised learning to pre-train a large part of the network and substantially reduce the number of training steps. Training performance is further improved by gradually increasing the environment complexity over time. We also propose an efficient neural network structure that is capable of learning multiple targets in multiple environments. Our method navigates in continuous state spaces and, on the AI2-THOR environment simulator, outperforms state-of-the-art goal-oriented visual navigation methods from the literature.
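
To make the training objective described above more concrete, the following is a minimal sketch of how an A2C loss might be combined with the three auxiliary prediction losses, together with a simple curriculum schedule. It is an illustration under stated assumptions, not the implementation used in the paper: the function names, the loss coefficients (`value_coef`, `entropy_coef`, `aux_weight`), the use of mean squared error for every auxiliary task, and the linear "maximum start distance" curriculum are all hypothetical choices that the abstract does not specify.

```python
# Illustrative sketch only; names, coefficients and loss choices are assumptions.

def a2c_loss(log_prob, advantage, value, ret, entropy,
             value_coef=0.5, entropy_coef=0.01):
    """Single-sample A2C loss: policy term + value term - entropy bonus."""
    policy_loss = -log_prob * advantage      # policy-gradient term
    value_loss = (ret - value) ** 2          # critic regression term
    return policy_loss + value_coef * value_loss - entropy_coef * entropy


def mean_squared_error(pred, target):
    """Pixel-wise MSE over two flat lists of equal length."""
    if len(pred) != len(target):
        raise ValueError("prediction and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def total_loss(rl_loss, aux_preds, aux_targets, aux_weight=0.1):
    """Add the weighted auxiliary losses (e.g. depth map, observation
    segmentation, target segmentation) to the RL loss.
    MSE for every auxiliary head is an assumption made for brevity."""
    aux = sum(mean_squared_error(aux_preds[name], aux_targets[name])
              for name in aux_preds)
    return rl_loss + aux_weight * aux


def curriculum_max_distance(step, total_steps, min_dist=1.0, max_dist=10.0):
    """Hypothetical curriculum: linearly increase the maximum allowed
    start-to-target distance as training progresses."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return min_dist + frac * (max_dist - min_dist)


if __name__ == "__main__":
    rl = a2c_loss(log_prob=-1.0, advantage=2.0, value=1.0, ret=3.0, entropy=0.5)
    preds = {"depth": [1.0, 2.0], "obs_seg": [0.0, 1.0], "target_seg": [1.0, 1.0]}
    targets = {"depth": [1.0, 4.0], "obs_seg": [0.0, 1.0], "target_seg": [1.0, 0.0]}
    print(total_loss(rl, preds, targets))
    print(curriculum_max_distance(step=50, total_steps=100))
```

Because the auxiliary targets (depth and segmentation) can be read directly from the simulator, the auxiliary terms alone could in principle be minimized in a supervised pre-training phase before the RL term is switched on, which is the role the abstract attributes to them.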
