Improved generative adversarial network for retinal image super-resolution

doi:10.1016/j.cmpb.2022.106995

Computer Methods and Programs in Biomedicine

Volume 225, October 2022, 106995

https://doi.org/10.1016/j.cmpb.2022.106995

Highlights

• Developed an improved generative adversarial network.
• Designed a novel residual attention block.
• Use the Charbonnier loss function instead of the MSE loss function.
• Remove the BN layer and add multiple updated residual blocks.

Abstract

Background and objective

The retina is the only organ in the body that can be observed non-invasively using visible light. By analyzing retinal images, we can achieve early screening, diagnosis and prevention of many ophthalmological and systemic diseases, helping patients avoid the risk of blindness. Owing to their powerful feature extraction capabilities, many deep learning super-resolution reconstruction networks have been applied to retinal image analysis and have achieved excellent results.

Methods

Current super-resolution reconstruction networks produce results that lack high-frequency information and have poor visual quality at large scale factors. To address this, we present an improved generative adversarial network (IGAN) algorithm for retinal image super-resolution reconstruction. Firstly, we construct a novel residual attention block to recover the high-frequency information and texture details that reconstruction results lack at large scale factors. Secondly, we remove the Batch Normalization layer, which affects the quality of image generation, from the residual network. Finally, we use the more robust Charbonnier loss function instead of the mean square error loss function, and a total variation (TV) regularization term to smooth the training results.
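The difference between the two loss functions named above can be illustrated with a minimal sketch. It assumes the usual definition of the Charbonnier penalty, sqrt(d^2 + eps^2) averaged over pixels, with eps = 1e-3 chosen here for illustration; the function names and the toy pixel values are hypothetical and are not taken from the paper.

```python
import math


def mse_loss(pred, target):
    """Mean squared error over two equally sized flat sequences of pixel values."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def charbonnier_loss(pred, target, eps=1e-3):
    """Mean Charbonnier penalty sqrt(d^2 + eps^2); eps is an assumed constant."""
    total = sum(math.sqrt((p - t) ** 2 + eps ** 2) for p, t in zip(pred, target))
    return total / len(pred)


# Toy data (hypothetical): the second prediction contains one large outlier.
target = [0.0, 0.0, 0.0, 0.0]
clean = [0.1, 0.1, 0.1, 0.1]
outlier = [0.1, 0.1, 0.1, 2.0]

print("MSE:        ", mse_loss(clean, target), mse_loss(outlier, target))
print("Charbonnier:", charbonnier_loss(clean, target), charbonnier_loss(outlier, target))
```

Because the Charbonnier penalty grows roughly linearly for large errors, a single outlier pixel raises it far less than it raises the squared error, which is the sense in which it is usually called more robust.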

Results

Experimental results show that our proposed method significantly improves objective evaluation indicators such as peak signal-to-noise ratio and structural similarity. The obtained image has rich texture details and a better visual experience than the state-of-the-art image super-resolution methods.
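For readers unfamiliar with the first metric, the following is a minimal sketch of peak signal-to-noise ratio, assuming 8-bit images (peak value 255) stored as flat sequences of pixel values. The function name and the sample values are hypothetical; structural similarity is more involved and is not sketched here.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel sequences."""
    mse = sum((r - s) ** 2 for r, s in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy data (hypothetical): every pixel is off by 10 grey levels, so MSE = 100.
reference = [100, 100, 100, 100]
reconstructed = [110, 90, 110, 90]
print(psnr(reference, reconstructed))
```

A higher PSNR means a smaller mean squared error relative to the peak pixel value.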

Conclusion

Our proposed method can better learn the mapping relationship between low-resolution and high-resolution retinal images. This method can be effectively and stably applied to the analysis of retinal images, providing an effective basis for early clinical treatment.

Introduction

Retinal image analysis is an important part of medical image analysis. It enables the diagnosis of many blinding ophthalmological diseases, such as retinoblastoma and age-related macular degeneration [1], [2]. In addition, because retinal images are acquired non-invasively, super-resolution (SR) reconstruction of retinal images helps experts achieve an earlier and more comprehensive diagnosis of blinding retinal diseases; at the same time, it can help control the deterioration of the condition and help patients avoid the risk of blindness [3].

With the development of artificial intelligence, more and more researchers are applying deep learning SR reconstruction technology to medical image processing, and deep learning-based retinal image analysis makes diagnostic results more objective and quantitative [4]. At the same time, it can reach a level similar to that of experts in the field without the long-term professional training that experts require. This helps address the small number of ophthalmology experts in China and the difficulty of screening in communities and remote areas, for example screening for retinopathy of prematurity, diabetic retinopathy, and other eye diseases [5].

Single-image super-resolution (SISR) reconstruction is one of the important research directions in computer vision and image processing [6]. Without improving the hardware of the imaging equipment, the image resolution is increased through signal processing and software methods; this is what SR reconstruction refers to. According to the reconstruction method, standard SISR reconstruction algorithms can be divided into three categories: interpolation-based SR reconstruction [7], reconstruction-based SR reconstruction [8], and learning-based SR reconstruction [9].

Artificial intelligence technology has advanced rapidly in recent years, and deep neural networks have been applied in various fields. Convolutional neural networks (CNNs) are good at extracting high-level abstract features of data and learning the underlying distribution of the data. With the application of CNNs in the image field, SR research based on deep learning has received extensive attention [10]. Dong et al. [11] first proposed a SISR reconstruction method using a super-resolution CNN (SRCNN). However, the network has few layers, converges slowly, and uses small convolution kernels. Furthermore, the extracted features are all local features, which makes texture details difficult to recover and leads to poor reconstruction results. Subsequently, Dong et al. [12] proposed a fast super-resolution convolutional neural network (FSRCNN), which replaced the 5 × 5 convolution kernel in SRCNN with two concatenated 3 × 3 convolution kernels to reduce parameters and increase the number of network layers.

Building on the SRCNN method, Shi et al. [13] added sub-pixel convolutional layers to convert low-resolution (LR) images into high-resolution (HR) images efficiently and in real time. He et al. [14] proposed deep residual learning for image recognition (ResNet), which solved the problem of vanishing gradients as the number of network layers increases. Kim et al. [15] proposed a very deep convolutional network (VDSR) based on ResNet, using a deeper VGG-style network structure that reaches a depth of twenty layers and significantly improves convergence speed through residual learning. Lim et al. [16] proposed an enhanced deep super-resolution network (EDSR), which removes the Batch Normalization (BN) layer from the residual block, saving 40% of the memory usage and allowing a larger model with better performance to be built. Lai et al. [17] proposed deep Laplacian pyramid networks for fast and accurate super-resolution (LapSRN), which replaced the L2 loss function with the Charbonnier loss function, trained the network with deep supervision, and achieved high-quality reconstruction.
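The rearrangement step at the heart of a sub-pixel convolutional layer can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the channel ordering commonly adopted by deep learning frameworks, operates on nested lists rather than tensors, omits the convolution that precedes the rearrangement, and the function name `pixel_shuffle` and the toy input are hypothetical.

```python
def pixel_shuffle(features, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list."""
    channels_in = len(features)
    if channels_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c_out = channels_in // (r * r)
    h, w = len(features[0]), len(features[0][0])
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output cell
            for j in range(r):      # column offset inside each r x r output cell
                src = features[c * r * r + i * r + j]
                for y in range(h):
                    for x in range(w):
                        out[c][y * r + i][x * r + j] = src[y][x]
    return out


# Toy input (hypothetical): four 1x2 feature maps become one 2x4 image for r = 2.
features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
print(pixel_shuffle(features, 2))  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

All computation before this step runs at LR resolution, which is why the approach is efficient: the upscaling itself is a reindexing of values with no arithmetic.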

In addition, inspired by the Generative Adversarial Network (GAN), Ledig et al. [18] proposed the super-resolution generative adversarial network (SRGAN), which was the first to use a GAN in the field of image SR and improved the perceptual quality of images by enhancing the realism of fine details. Wang et al. [19] proposed enhanced super-resolution generative adversarial networks (ESRGAN) based on the SRGAN, replacing residual blocks with dense blocks and removing the BN layer, which significantly improved the reconstruction quality. SR methods based on neural networks have made significant progress, and researchers have applied them to the field of medical imaging. Hatvani et al. [20] used a U-Net and a sub-pixel network to perform super-resolution processing on 2D tooth computed tomography images; the model used the Mean Square Error (MSE) loss [21] and a total variation regularization loss. This method achieved good results and helped experts better observe medically significant features, such as the root canal's size, shape, and curvature.
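A total variation regularization term of the kind mentioned above can be sketched as follows. The sketch assumes the anisotropic form, that is, the sum of absolute differences between horizontally and vertically adjacent pixels of a single-channel image; other variants exist, and the function name and toy images are hypothetical.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D nested list of pixel values."""
    h, w = len(image), len(image[0])
    vertical = sum(
        abs(image[y + 1][x] - image[y][x]) for y in range(h - 1) for x in range(w)
    )
    horizontal = sum(
        abs(image[y][x + 1] - image[y][x]) for y in range(h) for x in range(w - 1)
    )
    return vertical + horizontal


# Toy data (hypothetical): a flat patch versus a checkerboard patch.
flat = [[1, 1], [1, 1]]
checker = [[0, 1], [1, 0]]
print(total_variation(flat), total_variation(checker))  # 0 4
```

Adding a weighted term like this to the training loss penalizes abrupt pixel-to-pixel changes, which encourages smoother outputs.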

Section snippets

SRGAN

The SRGAN is a GAN-based model for SR reconstruction of images. It consists of two parts: a generator and a discriminator. I^SR represents the SR image reconstructed by the SRGAN network, I^HR is an HR image, and I^LR is the LR image corresponding to I^HR, obtained from I^HR through down-sampling. G represents the generator network and D represents the discriminator network. The schematic diagram of the SRGAN model is shown in Fig. 1.
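Using the symbols defined above, the adversarial game between G and D is commonly written in the following standard GAN form. This is given as a reference formulation under the assumption that the usual min-max objective applies; the paper's own equation is not shown in this snippet.

```latex
I^{SR} = G\left(I^{LR}\right), \qquad
\min_{G}\,\max_{D}\;
\mathbb{E}_{I^{HR}}\!\left[\log D\!\left(I^{HR}\right)\right]
+ \mathbb{E}_{I^{LR}}\!\left[\log\!\left(1 - D\!\left(G\!\left(I^{LR}\right)\right)\right)\right]
```

Here D is trained to assign high scores to real HR images and low scores to generated SR images, while G is trained to produce SR images that D scores as real.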

The generator generates I^SR by inputting I^LR

Methodology

Owing to the excellent image simulation generation capability of the GAN network, it can help overcome the obstacles of existing deep learning in the field of retinal imaging, solve the limitation of data on retinal image analysis, and enable deep learning to be more widely used in the field of retinal imaging. At the same time, the unique discriminant network of the GAN network can help generate finer local details of the network output, which is critical for the processing of retinal image

Dataset and training details

The dataset used in the experiment is DIV2K [29], a recently proposed dataset of high-quality 2K-resolution images, containing 800 training images, 100 validation images, and 100 test images. However, since the test images have not been released yet, the Set5 [30], Set14 [31], and Urban100 [32] datasets are used here for testing. All experiments are performed between LR images and HR images for 4 × enlargement. Owing to the limitations of the experimental equipment, the image is

Conclusion

In this paper, we propose an improved generative adversarial network (IGAN), an SRGAN-based algorithm for retinal image SR reconstruction, which increases the high-frequency information of the image by constructing the residual block of the attention convolutional neural network. Furthermore, we remove the BN layer that affects the quality of retinal image generation in the residual block, improving the network's training speed. Then, we introduce the more robust Charbonnier to replace the

Ethical approval

No ethics approval is required.

Declaration of Competing Interest

The authors declare that they have no conflicts of interest.

Acknowledgment

This work was supported by the National Natural Science Foundation of China under Grant Nos. 61976215 and 62176259.

References (42)

M Badar et al.
Application of deep learning for retinal image analysis: a review
Comput. Sci. Rev.
(2020)

D Qiu et al.
Multiple improved residual networks for medical image super-resolution
Future Gener. Comput. Syst.
(2021)

K K L Wong et al.
Deep learning-based cardiovascular image diagnosis: a promising challenge
Future Gener. Comput. Syst.
(2020)

D Qiu et al.
Gradual back-projection residual attention network for magnetic resonance image super-resolution
Comput. Methods Programs Biomed.
(2021)

D Qiu et al.
End-to-end residual attention mechanism for cataractous retinal image dehazing
Comput. Methods Programs Biomed.
(2022)

D Qiu et al.
Dual U-Net residual networks for cardiac magnetic resonance images super-resolution
Comput. Methods Programs Biomed.
(2022)

D E Romo-Bucheli et al.
End-to-end deep learning model for predicting treatment requirements in neovascular AMD from longitudinal retinal OCT imaging
IEEE J. Biomed. Health Inf.
(2020)

X Yi et al.
Generative adversarial network in medical imaging: a review
IEEE Signal Process. Mag.
(2018)

D Peng et al.
SAM-GAN: self-attention supporting multi-stage generative adversarial networks for text-to-image synthesis
Neural Netw.
(2021)

W Yang et al.
Deep learning for single image super-resolution: a brief review
IEEE Trans. Multimedia
(2019)

R Keys
Cubic convolution interpolation for digital image processing
IEEE Trans. Acoust. Speech Signal Process.
(1981)

T Dai et al.
Second-order attention network for single image super-resolution

C Dong et al.
Image super-resolution using deep convolutional networks
IEEE Trans. Pattern Anal. Mach. Intell.
(2016)

C Dong et al.
Accelerating the super-resolution convolutional neural network

W Shi et al.
Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network

K He et al.
Deep residual learning for image recognition

J Kim et al.
Accurate image super resolution using very deep convolutional networks

B Lim et al.
Enhanced deep residual networks for single image super-resolution

W S Lai et al.
Deep Laplacian pyramid networks for fast and accurate super resolution

C Ledig et al.
Photo-realistic single image super-resolution using a generative adversarial network
(2017)

X Wang et al.
ESRGAN: enhanced super-resolution generative adversarial networks
