Abstract:
Transformer-based methods have achieved excellent performance in single image super-resolution (SISR) due to their ability to model long-range dependencies. However, most existing methods require huge computational resources, making them difficult to deploy on mobile devices with limited computing and storage capacity. In this paper, we propose a lightweight CNN and spatial-channel Transformer hybrid network (CSCTHN), which applies spatial and channel self-attention alternately and leverages the local feature extraction capability of CNNs. CSCTHN's basic unit, the CNN and Transformer hybrid module (CTHM), comprises three key components: a dual-branch interactive spatial self-attention block (DISSAB) for capturing spatial context at lower computational cost, a channel self-attention block (CSAB) for capturing channel information, and a local feature enhancement block (LFEB) that uses a CNN to extract local information. Extensive experiments demonstrate that CSCTHN surpasses state-of-the-art methods in both reconstruction performance and model complexity (e.g., 31.33 dB on Manga109 ×4 with only 706K parameters).
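The abstract's motivation for alternating channel self-attention with spatial self-attention is cost: a vanilla spatial attention map is (H·W)×(H·W), while a channel attention map is only C×C. A minimal back-of-the-envelope sketch of this asymptotic difference (feature-map sizes here are hypothetical, not taken from the paper):

```python
# Illustrative attention-cost comparison; sizes are hypothetical,
# not taken from the paper.

def spatial_attn_cost(h, w, c):
    # Vanilla spatial self-attention builds an (H*W) x (H*W) attention map,
    # so the dominant cost scales as O((H*W)^2 * C).
    n = h * w
    return n * n * c

def channel_attn_cost(h, w, c):
    # Channel self-attention builds a C x C attention map instead,
    # so the dominant cost scales as O(C^2 * H*W).
    return c * c * h * w

# For a 64x64 feature map with 64 channels, channel attention is
# 64x cheaper in this rough count:
print(spatial_attn_cost(64, 64, 64))  # 1073741824
print(channel_attn_cost(64, 64, 64))  # 16777216
```

The gap widens as spatial resolution grows, which is why channel attention is attractive for lightweight SISR models targeting mobile devices.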
Date of Conference: 15-19 July 2024
Date Added to IEEE Xplore: 30 September 2024