Fast Retinomorphic Event Stream for Video Recognition and Reinforcement Learning

  • 2018-05-19 15:10:37
  • Wanjia Liu, Huaijin Chen, Rishab Goel, Yuzhong Huang, Ashok Veeraraghavan, Ankit Patel

Abstract

Good temporal representations are crucial for video understanding, and the state-of-the-art video recognition framework is based on two-stream networks. In such a framework, besides the regular ConvNets responsible for RGB frame inputs, a second network is introduced to handle the temporal representation, usually the optical flow (OF). However, OF or other task-oriented flow is computationally costly, and is thus typically pre-computed. Critically, this prevents the two-stream approach from being applied to reinforcement learning (RL) applications such as video game playing, where the next state depends on the current state and action choices. Inspired by the early vision systems of mammals and insects, we propose a fast event-driven representation (EDR) that models several major properties of early retinal circuits: (1) logarithmic input response, (2) multi-timescale temporal smoothing to filter noise, and (3) bipolar (ON/OFF) pathways for primitive event detection [12]. Trading off directional information for speed (> 9000 fps), EDR enables fast real-time inference/learning in video applications that require interaction between an agent and the world, such as game playing, virtual robotics, and domain adaptation. In this vein, we use EDR to demonstrate performance improvements over state-of-the-art reinforcement learning algorithms for Atari games, something that has not been possible with pre-computed OF. Moreover, with UCF-101 video action recognition experiments, we show that EDR performs near state-of-the-art in accuracy while achieving a 1,500x speedup in input representation processing, as compared to optical flow.
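
The three retinal properties listed in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `edr_sketch`, the use of exponential moving averages for the temporal smoothing, the smoothing rates `alphas`, the event `threshold`, and the offset `eps` are all assumptions chosen for illustration, since the abstract does not specify them.

```python
import numpy as np

def edr_sketch(frames, alphas=(0.5, 0.1), threshold=0.05, eps=1e-3):
    """Hypothetical event-driven representation of a grayscale clip.

    frames: array of shape (T, H, W) with intensities in [0, 1].
    Returns a uint8 array of shape (T, 2 * len(alphas), H, W) holding one
    ON map and one OFF map per timescale. All parameters are assumptions.
    """
    frames = np.asarray(frames, dtype=np.float64)
    # (1) Logarithmic input response; eps avoids log(0).
    log_frames = np.log(frames + eps)
    n_frames = log_frames.shape[0]
    # (2) One running average per timescale, initialised on the first frame.
    emas = [log_frames[0].copy() for _ in alphas]
    out = np.zeros((n_frames, 2 * len(alphas)) + log_frames.shape[1:],
                   dtype=np.uint8)
    for t in range(n_frames):
        for k, alpha in enumerate(alphas):
            # Change of the current frame relative to the smoothed past.
            diff = log_frames[t] - emas[k]
            # (3) Bipolar pathways: brightening -> ON, darkening -> OFF.
            out[t, 2 * k] = diff > threshold
            out[t, 2 * k + 1] = diff < -threshold
            # Update the smoothed history after events are emitted.
            emas[k] = (1.0 - alpha) * emas[k] + alpha * log_frames[t]
    return out

# Usage: one pixel brightens at t=1 and darkens again at t=2.
clip = np.full((3, 2, 2), 0.2)
clip[1, 0, 0] = 0.8
events = edr_sketch(clip)
```

Only elementwise operations and one pass over the frames are involved, which is consistent with the abstract's point that such a representation gives up directional information in exchange for speed; the sketch makes no claim about the reported frame rates.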

 
