Abstract
Traffic accident anticipation aims to accurately and promptly predict the occurrence of a future accident from dashcam videos, which is vital for a safety-guaranteed self-driving system. To encourage an early and accurate decision, existing approaches typically focus on capturing the cues of spatial and temporal context before a future accident occurs. However, their decision-making lacks visual explanation and ignores the dynamic interaction with the environment. In this paper, we propose Deep ReInforced accident anticipation with Visual Explanation, named DRIVE. The method simulates both the bottom-up and top-down visual attention mechanisms in a dashcam observation environment so that the decision from the proposed stochastic multi-task agent can be visually explained by attentive regions. Moreover, the proposed dense anticipation reward and sparse fixation reward are effective in training the DRIVE model with our improved reinforcement learning algorithm. Experimental results show that the DRIVE model achieves state-of-the-art performance on multiple real-world traffic accident datasets. The code and pre-trained model will be available upon paper acceptance.