Abstract
Inspired by human visual attention, we introduce a Maximum Entropy Deep Inverse Reinforcement Learning (MEDIRL) framework for modeling the visual attention allocation of drivers in imminent rear-end collisions. MEDIRL is composed of visual, driving, and attention modules. Given a front-view driving video and corresponding eye fixations from humans, the visual and driving modules extract generic and driving-specific visual features, respectively. Finally, the attention module learns the intrinsic task-sensitive reward functions induced by eye fixation policies recorded from attentive drivers. MEDIRL uses the learned policies to predict visual attention allocation of drivers. We also introduce EyeCar, a new dataset of driver visual attention in accident-prone situations. We conduct comprehensive experiments and show that MEDIRL outperforms previous state-of-the-art methods on driving task-related visual attention allocation on the following large-scale driving attention benchmark datasets: DR(eye)VE, BDD-A, and DADA-2000. The code and dataset are provided for reproducibility.