Single image super-resolution with attention-based densely connected module
Introduction
Single image super-resolution (SISR) is a low-level computer vision task that aims to reconstruct a high-resolution (HR) image from its low-resolution (LR) counterpart. It has been widely used in multimedia and medical image processing and has received considerable research interest recently.
With the renaissance of deep learning, SISR has achieved significant progress. Many super-resolution networks [1], [2], [3], [4], [5], [6], [7], [8], [9], [10] are designed to reconstruct HR images by learning a nonlinear mapping between LR-HR pairs. For example, SRCNN [1] applies three convolutional layers to learn the mapping between LR and HR images; LapSRN [2] uses three sub-networks to estimate residual information for progressive super-resolution; DRCN [3] utilizes residual learning to increase the depth of the super-resolution network and achieves promising results; SRDenseNet [4] adopts the dense connection operations proposed in [11] to avoid the vanishing gradient problem during training, and learns a compact model for SISR.
Among them, benefiting from the abundant features provided by the dense connection block, SRDenseNet achieves higher reconstruction accuracy than the other super-resolution networks [1], [2], [3]. However, these abundant features also contain irrelevant information, which degrades the quality of the final reconstruction. The high computational cost of the dense connection block is also an obstacle to applying SRDenseNet in real-world applications. Therefore, reducing the effect of the redundant features produced by the dense connection block is an effective way to improve the performance of SRDenseNet.
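For concreteness, a minimal DenseNet-style block of the kind SRDenseNet builds on can be sketched in PyTorch as follows; the channel count, growth rate, and depth here are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Minimal dense connection block: each layer receives the concatenation
    of the block input and all preceding layer outputs (illustrative sizes)."""
    def __init__(self, in_channels=64, growth_rate=16, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            ))
            channels += growth_rate  # inputs grow by growth_rate per layer

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # dense connectivity: concatenate everything seen so far
            features.append(layer(torch.cat(features, dim=1)))
        # output has in_channels + num_layers * growth_rate channels
        return torch.cat(features, dim=1)
```

The concatenation is what makes the features "abundant": the output channel count grows linearly with depth, and nothing in the block itself filters out channels that are irrelevant to reconstruction.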
Inspired by the neural attention mechanism, many CNN-based attention modules [12], [13], [14], [15], [16], [17], [18], [19], [20], [21], [22], [23], [24], [25] have been proposed recently, which aim to enhance task-relevant feature representations and suppress task-irrelevant ones. Therefore, in this paper, we apply the neural attention mechanism to SRDenseNet to decrease feature redundancy, and propose a novel attention-based densely connected module (DAM). DAM consists of two parts: a channel attention module (CAM) and a dense connection block (DB). CAM is placed at the front of each DB and generates channel weight scores to re-weight each channel feature map, suppressing redundant information. Further, we apply the proposed DAM to construct an Attention-based Densely connected network (ADSRNet) for SISR. With the help of DAM, ADSRNet generates more photo-realistic SR images than SRDenseNet. Besides, we also explore the effectiveness of our proposed DAM on other densely-connected networks (e.g., RDN [26] and DBPN [27]). We perform extensive experiments on commonly-used super-resolution benchmarks, and the results demonstrate the effectiveness of our method.
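The channel re-weighting idea behind CAM can be sketched as a squeeze-and-excitation-style module [12] in PyTorch; the reduction ratio and channel width below are assumptions for illustration, not necessarily the paper's exact design:

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """SE-style channel attention sketch: global pooling -> bottleneck ->
    sigmoid scores in (0, 1) that re-weight each channel feature map."""
    def __init__(self, channels=64, reduction=16):
        super().__init__()
        self.score = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                        # squeeze: one statistic per channel
            nn.Conv2d(channels, channels // reduction, 1),  # bottleneck
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),  # restore channel dim
            nn.Sigmoid(),                                   # per-channel weight in (0, 1)
        )

    def forward(self, x):
        # channels scored near 0 (redundant) are suppressed before the DB
        return x * self.score(x)
```

Placing such a module in front of each DB means the dense concatenation operates on re-weighted maps, so low-scoring (redundant) channels contribute less to every subsequent layer.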
To summarize, the contributions of our work are described as follows:
- We propose the DAM for SISR, which can effectively reduce redundant information from the DB and improve image reconstruction accuracy.
- We propose the ADSRNet for SISR, which achieves promising results on commonly-used super-resolution benchmarks.
This paper extends our conference paper [28] in three aspects. Firstly, to further evaluate the effectiveness of DAM, we combine it with other densely connected super-resolution networks, such as RDN [26] and DBPN [27], and perform extensive experiments. Secondly, we compare our proposed DAM with other attention modules through quantitative and qualitative analysis. Thirdly, we extend the evaluation benchmarks and evaluate ADSRNet on Urban100, where our model also achieves promising results.
The rest of this paper is organized as follows. We review the related works in Section 2. In Section 3, we introduce our proposed method in detail. In Section 4, we describe the implementation details and present extensive experiments on public super-resolution benchmarks. We conclude the paper in Section 5.
Neural network-based single image super-resolution
Benefiting from the success of deep learning technology, numerous deep learning-based methods have been proposed recently. SRCNN [1] was the first deep convolutional neural network proposed for SISR. It first applies bicubic interpolation to upsample the low-resolution image to the target size, and then learns the LR-to-HR mapping through a three-layer convolutional neural network. Kim et al. proposed the deeply-recursive convolutional network DRCN [3] and VDSR [29].
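The SRCNN pipeline described above (bicubic upsampling followed by three convolutional layers) can be sketched in PyTorch; the 9-1-5 kernel sizes and 64/32 channel widths follow the original SRCNN paper, and operating on a single luminance channel is an assumption of this sketch:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SRCNN(nn.Module):
    """SRCNN sketch: bicubic upsampling to the target size, then
    patch extraction -> nonlinear mapping -> reconstruction."""
    def __init__(self, scale=2):
        super().__init__()
        self.scale = scale
        self.net = nn.Sequential(
            nn.Conv2d(1, 64, kernel_size=9, padding=4),  # patch extraction
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),            # nonlinear mapping
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 1, kernel_size=5, padding=2),  # reconstruction
        )

    def forward(self, lr):
        # interpolate first, so the convolutions run at HR resolution
        up = F.interpolate(lr, scale_factor=self.scale,
                           mode='bicubic', align_corners=False)
        return self.net(up)
```

Note that because the interpolation happens first, all three convolutions operate at the target resolution, which is part of why later architectures that upsample at the end of the network are cheaper to run.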
Proposed method
In this section, we first introduce the details of our proposed DAM. Then, we demonstrate the detailed components of our proposed ADSRNet.
Experiment
In this section, we first introduce the evaluation datasets and metrics. Then, we describe the implementation details of the experiments. After that, we introduce the comparison results with other state-of-the-art methods. Finally, we give a comprehensive model analysis of our proposed ADSRNet.
Conclusions
In this paper, we propose a new attention module DAM to decrease redundant information provided by DB, and propose the ADSRNet for SISR. With DAM’s help, ADSRNet generates more photo-realistic image reconstruction results compared with other super-resolution networks and achieves high reconstruction accuracy. Experiment results on commonly-used super-resolution benchmarks demonstrate the effectiveness of our method. However, our method is only developed for SISR on synthetic data and does not
CRediT authorship contribution statement
Zijian Wang: Conceptualization, Methodology, Software, Writing - original draft. Yao Lu: Methodology, Writing - review & editing. Weiqi Li: Conceptualization, Methodology, Writing - review & editing. Shunzhou Wang: Writing - review & editing. Xuebo Wang: Writing - review & editing. Xiaozhen Chen: Writing - review & editing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 61273273), by the National Key Research and Development Plan (No. 2017YFC0112001), and by China Central Television (JG2018-0247).
Zijian Wang received the B.S. degree of animation technology from Communication University of China, in 2004, and M.E. degree of software engineering from Beijing University of Posts and Telecommunications, in 2011. He is engaged in computer visual effect development in China Central Television. His research interests include image processing and pattern recognition.
References (49)
- et al., Recovering realistic texture in image super-resolution by deep spatial feature transform
- et al., Image super-resolution using deep convolutional networks, IEEE Trans. Pattern Anal. Mach. Intell. (2015)
- et al., Deep Laplacian pyramid networks for fast and accurate super-resolution
- et al., Deeply-recursive convolutional network for image super-resolution
- T. Tong, G. Li, X. Liu, Q. Gao, Image super-resolution using dense skip connections, in: Proceedings of the IEEE...
- et al., Closed-loop matters: dual regression networks for single image super-resolution
- et al., Image super-resolution with cross-scale non-local attention and exhaustive self-exemplars mining
- G. Li, Y. Lu, L. Lu, Z. Wu, X. Wang, S. Wang, Semi-blind super-resolution with kernel-guided feature modification, in:...
- et al., Residual feature aggregation network for image super-resolution
- et al., Perceptual extreme super-resolution network with receptive field block
- Lightweight image super-resolution with information multi-distillation network
- Densely connected convolutional networks
- Squeeze-and-excitation networks
- Cascaded human-object interaction recognition
- Residual attention network for image classification
- Learning unsupervised video object segmentation through visual attention
- Non-local neural networks
- Real image denoising with feature attention
- Salient object detection with pyramid attention and salient edges
Yao Lu received the B.S. degree in electronics from Northeast University, Shenyang, China, in 1982 and the Ph.D. degree in computer science from Gunma University, Gunma, Japan, in 2003. He was a Lecturer and an Associate Professor with Hebei University, China, from 1986 to 1998, and a foreign researcher with Gunma University in 1999. In 2003, he was an invited professor with the Engineering Faculty, Gunma University, and a Visiting Fellow of the University of Sydney, Australia. He is currently a Professor with the Department of Computer Science, Beijing Institute of Technology, Beijing, China. He has published more than 100 papers in international conferences and journals. His research interests include neural networks, image processing and video analysis, and pattern recognition.
Weiqi Li received the B.S. degree from Jilin University, in 2017. She is currently pursuing the M.E. degree in Computer Science at Beijing Institute of Technology, Beijing, China. Her main research interests include image processing and pattern recognition.
Shunzhou Wang has been pursuing the Ph.D. degree at the Beijing Laboratory of Intelligent Information Technology, School of Computer Science, Beijing Institute of Technology, Beijing, China, since 2018. His supervisor is Prof. Yao Lu, and his main research interests include computer vision and deep learning.
Xuebo Wang received the B.S. degree from Taiyuan University of Technology, in 2017. He is currently pursuing the M.E. degree in Computer Science at Beijing Institute of Technology, Beijing, China. His main research interests include image processing and pattern recognition.
Xiaozhen Chen received the B.S. degree from Shenyang Institute of Engineering, in 2015. She is currently pursuing the M.E. degree in Biomedical Engineering at Beijing Institute of Technology, Beijing, China. Her main research interests include image processing and pattern recognition.