Abstract
Humans can naturally and effectively find salient regions in complex scenes.Motivated by this observation, attention mechanisms were introduced intocomputer vision with the aim of imitating this aspect of the human visualsystem. Such an attention mechanism can be regarded as a dynamic weightadjustment process based on features of the input image. Attention mechanismshave achieved great success in many visual tasks, including imageclassification, object detection, semantic segmentation, video understanding,image generation, 3D vision, multi-modal tasks and self-supervised learning. Inthis survey, we provide a comprehensive review of various attention mechanismsin computer vision and categorize them according to approach, such as channelattention, spatial attention, temporal attention and branch attention; arelated repository https://github.com/MenghaoGuo/Awesome-Vision-Attentions isdedicated to collecting related work. We also suggest future directions forattention mechanism research.