End-to-end Active Object Tracking and Its Real-world Deployment via Reinforcement Learning

  • 2018-08-10 04:04:19
  • Wenhan Luo, Peng Sun, Fangwei Zhong, Wei Liu, Tong Zhang, Yizhou Wang
  • 8

Abstract

We study active object tracking, where a tracker takes visual observations (i.e., frame sequences) as inputs and produces the corresponding camera control signals as outputs (e.g., move forward, turn left, etc.). Conventional methods tackle tracking and camera control tasks separately, and the resulting system is difficult to tune jointly. Such an approach also requires significant human effort for image labeling and expensive trial-and-error system tuning in the real world. To address these issues, we propose, in this paper, an end-to-end solution via deep reinforcement learning. A ConvNet-LSTM function approximator is adopted for the direct frame-to-action prediction. We further propose an environment augmentation technique and a customized reward function which are crucial for successful training. The tracker trained in simulators (ViZDoom and Unreal Engine) demonstrates good generalization behaviors in the case of unseen object moving paths, unseen object appearances, unseen backgrounds, and distracting objects. The system is robust and can restore tracking after occasional loss of the target being tracked. We also find that the tracking ability, obtained solely from simulators, can potentially transfer to real-world scenarios. We demonstrate successful examples of such transfer, via experiments over the VOT dataset and the deployment of a real-world robot using the proposed active tracker trained in simulation.

 
