Restormer: Efficient Transformer for High-Resolution Image Restoration

Abstract

Since convolutional neural networks (CNNs) perform well at learning generalizable image priors from large-scale data, these models have been extensively applied to image restoration and related tasks. Recently, another class of neural architectures, Transformers, have shown significant performance gains on natural language and high-level vision tasks. While the Transformer model mitigates the shortcomings of CNNs (i.e., limited receptive field and inadaptability to input content), its computational complexity grows quadratically with the spatial resolution, therefore making it infeasible to apply to most image restoration tasks involving high-resolution images. In this work, we propose an efficient Transformer model by making several key designs in the building blocks (multi-head attention and feed-forward network) such that it can capture long-range pixel interactions, while still remaining applicable to large images. Our model, named Restoration Transformer (Restormer), achieves state-of-the-art results on several image restoration tasks, including image deraining, single-image motion deblurring, defocus deblurring (single-image and dual-pixel data), and image denoising (Gaussian grayscale/color denoising, and real image denoising). The source code and pre-trained models are available at https://github.com/swz30/Restormer.
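
To make the quadratic-growth claim concrete, the short sketch below counts the entries in a full spatial self-attention map as the resolution increases. It is an illustration only, not code from the Restormer repository: it assumes one token per pixel and a single attention head, and the function name `attention_map_entries` is hypothetical.

```python
def attention_map_entries(height: int, width: int) -> int:
    """Entries in a full spatial self-attention map for a height x width input.

    Assumption (not from the paper): one token per pixel, one attention head.
    """
    tokens = height * width   # number of spatial positions
    return tokens * tokens    # each position attends to every position


if __name__ == "__main__":
    for side in (64, 128, 256, 512):
        entries = attention_map_entries(side, side)
        print(f"{side}x{side} input: {entries:,} attention entries")
```

Under these assumptions, doubling each side of the image multiplies the number of pixels by 4 and the size of the attention map by 16, which is why standard spatial self-attention becomes impractical for high-resolution restoration.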
