SuperYOLO: Super Resolution Assisted Object Detection in Multimodal Remote Sensing Imagery

Abstract

In this paper, we propose an accurate yet fast small object detection method for remote sensing imagery (RSI), named SuperYOLO, which fuses multimodal data and performs high-resolution (HR) object detection on multiscale objects by utilizing assisted super-resolution (SR) learning while considering both detection accuracy and computation cost. First, we construct a compact baseline by removing the Focus module to keep the HR features, which significantly reduces missed detections of small objects. Second, we utilize pixel-level multimodal fusion (MF) to extract information from various data to facilitate more suitable and effective features for small objects in RSI. Furthermore, we design a simple and flexible SR branch to learn HR feature representations that can discriminate small objects from vast backgrounds with low-resolution (LR) input, thus further improving the detection accuracy. Moreover, to avoid introducing additional computation, the SR branch is discarded in the inference stage, and the computation of the network model is reduced due to the LR input. Experimental results show that, on the widely used VEDAI RS dataset, SuperYOLO achieves an accuracy of 73.61% (in terms of mAP50), which is more than 10% higher than the SOTA large models such as YOLOv5l, YOLOv5x, and the RS-designed YOLOrs. Meanwhile, the GFLOPs and parameter size of SuperYOLO are about 18.1x and 4.2x less than those of YOLOv5x. Our proposed model shows a favorable accuracy-speed trade-off compared to the state-of-the-art models. The code will be open sourced at https://github.com/icey-zhang/SuperYOLO.
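
Two ideas from the abstract can be illustrated with a toy NumPy sketch: fusing modalities at the pixel level, and an auxiliary SR output that exists only during training. This is not the SuperYOLO implementation. The names `fuse_pixel_level` and `ToyDetector` are hypothetical, the fusion is assumed here to be plain channel concatenation of an RGB image with an infrared (IR) band, and the "backbone", "detection head", and 2x nearest-neighbour "SR branch" are placeholders chosen only to make the control flow visible.

```python
import numpy as np


def fuse_pixel_level(rgb, ir):
    """Stack an RGB image (H, W, 3) and an IR image (H, W) into (H, W, 4).

    Assumption: fusion is simple channel concatenation; the paper's MF
    module may be more elaborate.
    """
    if rgb.shape[:2] != ir.shape[:2]:
        raise ValueError("modalities must be spatially aligned")
    return np.concatenate([rgb, ir[..., None]], axis=-1)


class ToyDetector:
    """Hypothetical stand-in for a detector with a train-only SR branch."""

    def __init__(self, training=True):
        self.training = training

    def forward(self, x):
        # Placeholder "backbone": average over channels.
        features = x.mean(axis=-1)
        # Placeholder "detection head": threshold at the mean response.
        detections = features > features.mean()
        if self.training:
            # Placeholder "SR branch": 2x nearest-neighbour upsampling.
            sr = np.repeat(np.repeat(features, 2, axis=0), 2, axis=1)
            return detections, sr
        # At inference the SR branch is skipped, so it adds no computation.
        return detections, None
```

In this sketch, a 4x4 LR input yields an 8x8 auxiliary output when `training=True` and no auxiliary output when `training=False`, while the detection output keeps the input's spatial size in both modes.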
