Video Crowd Counting via Dynamic Temporal Modeling

  • 2019-07-04 03:07:22
  • Xingjiao Wu, Baohan Xu, Yingbin Zheng, Hao Ye, Jing Yang, Liang He

Abstract

Crowd counting aims to estimate the number of people present in a crowded space at a given instant, and it plays an increasingly important role in public safety. Many promising solutions have been proposed for crowd counting on still images. As applications of crowd counting continue to expand, applying the technique to video content has become an urgent problem. Although researchers have collected and labeled some video clips, little attention has been paid to the spatiotemporal characteristics of videos. To address this problem, this paper proposes a novel framework based on dynamic temporal modeling of the relationship between video frames. We model the relationship between adjacent features by constructing a set of dilated residual blocks for the crowd counting task; each stage has a set of dilated temporal convolutions that generates an initial prediction, which is then refined by the next stage. We extract features from the density maps, as we find that adjacent density maps share more similar information than the original video frames. We also propose a smaller backbone network to balance computational cost against good feature representation. We conduct experiments with the proposed framework on five crowd counting datasets and demonstrate its superiority over previous approaches in both effectiveness and efficiency.
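
The dilated residual blocks and stage-wise refinement described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the kernel size of 3, the doubling dilation schedule, the ReLU nonlinearity, and the linear readout to a per-frame count are all assumptions chosen for illustration, and the input is assumed to be one feature vector per frame (for example, pooled from that frame's density map).

```python
import numpy as np

def dilated_conv1d(x, w, dilation):
    """Temporal convolution with zero 'same' padding.

    x: (T, C_in) per-frame features; w: (K, C_in, C_out) with odd K (assumed).
    """
    T = x.shape[0]
    K = w.shape[0]
    pad = dilation * (K - 1) // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    out = np.zeros((T, w.shape[2]))
    for k in range(K):
        # Tap k reads the frame offset by (k - K // 2) * dilation.
        out += xp[k * dilation : k * dilation + T] @ w[k]
    return out

def dilated_residual_block(x, w, dilation):
    """Residual connection around a dilated temporal convolution + ReLU."""
    return x + np.maximum(dilated_conv1d(x, w, dilation), 0.0)

def stage(x, weights):
    """One stage: stacked blocks with dilation 1, 2, 4, ... (assumed schedule)."""
    for i, w in enumerate(weights):
        x = dilated_residual_block(x, w, dilation=2 ** i)
    return x

def multi_stage_counts(features, stage_weights, readouts):
    """Each stage refines the previous stage's output and emits its own
    per-frame count prediction (hypothetical linear readout)."""
    preds = []
    x = features
    for weights, r in zip(stage_weights, readouts):
        x = stage(x, weights)
        preds.append(x @ r)  # shape (T,)
    return preds

rng = np.random.default_rng(0)
T, C = 12, 4  # 12 frames, 4 feature channels (toy sizes)
feats = rng.normal(size=(T, C))
stage_weights = [[rng.normal(scale=0.1, size=(3, C, C)) for _ in range(3)]
                 for _ in range(2)]
readouts = [rng.normal(size=C) for _ in range(2)]
preds = multi_stage_counts(feats, stage_weights, readouts)
```

Here `preds[0]` plays the role of the initial prediction and `preds[1]` the refined one. Doubling the dilation in each block lets the temporal receptive field grow exponentially with depth while the number of weights grows only linearly.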

 
