A novel 3D medical image super-resolution method based on densely connected network

https://doi.org/10.1016/j.bspc.2020.102120

Abstract

High-quality, high-resolution medical images help doctors make more accurate diagnoses, but the resolution of medical images is often limited by factors such as the device, the operation and the compression rate. To deal with this issue, in this paper we propose a novel densely connected network for super-resolution reconstruction of 3D medical images. To obtain multiscale information, we first adopt 3D dilated convolutions with different dilation rates to extract shallow features. To better handle these hierarchical features, we combine local residual learning with densely connected layers, which apply 3D asymmetric convolution to improve performance without increasing inference time. Meanwhile, an improved attention module, which considers both channel-wise and spatial information, is applied to emphasize the channels and regions with more high-frequency details. Finally, a feature fusion module containing three parallel dilated convolutions is applied to fuse hierarchical features. Compared with state-of-the-art methods such as SRCNN, FSRCNN, SRResnet, DCSRN, ReCNN and DCED, our experimental results show that the proposed method performs better in both objective metrics and visual effect.

Introduction

With the development of computer vision technology, medical images play a vital role in clinical applications. For example, magnetic resonance imaging (MRI) has been widely used in the clinic. MRI offers high-resolution in vivo imaging and rich functional and anatomical multimodality tissue contrast [1]. It is mainly used for soft tissue [2] and can directly produce tomographic images of the transverse, sagittal, coronal and various inclined planes. However, hardware upgrades for MRI depend on advances in physics, which leads to long innovation cycles [2]. In addition to physical, technical and economic limitations [3], the MRI acquisition time and the motion of organs also diminish image quality. Low-resolution images may reduce the visibility of important pathological details and thus affect diagnostic results [3]. Therefore, improving the resolution of medical images is an important research topic, and many studies have shown that super-resolution (SR) can provide a relatively cheap solution.

Image super-resolution (SR) reconstruction is a typical ill-posed problem [4]; its main purpose is to reconstruct high-resolution (HR) images from low-resolution (LR) images. A variety of methods have been proposed, which can be roughly divided into three categories: interpolation-based, reconstruction-based and learning-based methods [2].

Interpolation-based methods generally include nearest-neighbor interpolation, bilinear interpolation [5], bicubic interpolation [6], and some subsequent refinements [7], [8]. These traditional methods are simple and fairly effective, but often fail to restore high-frequency information. Among reconstruction-based methods, Purkait et al. [9] proposed a maximum a posteriori (MAP) method that constrains the solution space using prior information. However, little prior information is available when the input image is small.

Traditional machine learning-based methods can be categorized into dictionary learning, regression and sparse-representation methods. Rueda et al. [10] and Bhatia et al. [11] proposed coupled dictionary learning methods that learn HR and LR dictionaries from MRI to generate SR images. These methods learn dictionaries on external LR–HR patches and benefit from sparse constraints to express the relationship between LR and HR images [12]. Regression methods directly predict pixels of the HR image from models. Wu et al. [13] addressed the mapping error problem using a kernel partial least squares regression model. To reduce the computational expense, the methods in [14], [15], [16] obtain a closed-form solution to the SR process via ridge regression, which uses the L2-norm to regularize the sparse coefficients. Yang et al. [17] were the first to apply sparse coding (SC) to the super-resolution problem, further improving performance with a more compact dictionary. Subsequently, Tian, Yang, Ying, Liu, Ben et al. [18], [19], [20], [21], [22] proposed several sparse-representation-based methods and obtained better results, and Wei et al. [16] proposed a sparse medical image super-resolution method. With the rapid development of deep learning, many deep learning-based methods have been proposed. Dong et al. [23] were the first to introduce a convolutional-neural-network-based super-resolution method (SRCNN), which comprises feature extraction, nonlinear mapping and reconstruction. Later, various works improved SR performance via residual learning [24], [25] and recursive learning [25], [26]. However, these methods are time-consuming because the input size equals the final output size. FSRCNN [27] instead upsamples the spatial resolution only at the end of the processing pipeline via a deconvolution layer. Ledig et al. [28] designed a network named SRResnet with 16 residual blocks, which EDSR [29] further improved by removing the BN layers and using residual scaling to speed up training. The literature [30] introduced densely connected networks to the field of image super-resolution and proposed SRDenseNet, which avoids gradient vanishing, enhances feature propagation, supports feature reuse, and reduces parameters.

For the super-resolution reconstruction of 3D medical images, Chen et al. [31] proposed a simple densely connected network (DCSRN) for 3D brain MRI. Pham et al. [32] proposed a deep 3D convolution neural network (ReCNN) for super-resolution of brain MRI data. Du et al. [33] proposed a dilated encoder–decoder network (DCED) to reconstruct high-resolution MRI. These methods generally focus on designing a deeper or wider network to learn more features. Yet, it is a great challenge to restore high-frequency details and fully utilize hierarchical features.

In this paper, we propose a super-resolution method for 3D medical images based on densely connected layers. The experimental results show that our method achieves superior performance in both objective metrics and visual effect. The main contributions of our method are summarized as follows.

  • The 3D dilated convolution module (DCM) with different dilation rates is adopted to increase the receptive field and obtain multiscale information without extra parameters.

  • We introduce a 3D channel-wise and spatial attention module (CSAM) to focus on the more important features and to improve the learning ability of the model.

  • We propose a local residual dense attention module (LRDAM), which includes a bottleneck layer, a residual dense module (RDM) and a CSAM. The bottleneck layer reduces the data dimension and thus further reduces the number of parameters. The RDM merges hierarchical features to further enhance the learning ability of the network. Besides, local residual learning not only transfers abundant image details to later layers, but also aids gradient flow, which simplifies training of the deep network.

  • The 3D asymmetric convolution (AC) is adopted in place of standard convolution. Due to the additivity of convolution, the 3D asymmetric convolutions can be fused into a standard convolution before testing, which improves performance without increasing inference time.

  • The feature fusion module (FFM), which includes parallel dilated convolution, is used at the end of the network to merge hierarchical features.
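The receptive-field gain claimed for the DCM can be checked with a short calculation: a kernel of size k with dilation d spans k + (k − 1)(d − 1) voxels per axis, while the parameter count of the layer is independent of d. A minimal sketch; the dilation rates 1, 2, 3 and the channel count 64 are illustrative, not necessarily the values used in the paper:

```python
def effective_kernel_size(k: int, d: int) -> int:
    """Span (in voxels, per axis) of a size-k kernel with dilation d."""
    return k + (k - 1) * (d - 1)

def param_count_3d(k: int, c_in: int, c_out: int) -> int:
    """Weights in a 3D convolution layer; note d does not appear."""
    return c_out * c_in * k ** 3

for d in (1, 2, 3):  # illustrative dilation rates
    span = effective_kernel_size(3, d)
    params = param_count_3d(3, 64, 64)
    print(f"dilation={d}: span={span} voxels, params={params}")
```

With dilation 3, a 3 × 3 × 3 kernel covers a 7-voxel span per axis, so stacking branches with different rates collects multiscale context at the cost of a single standard layer's parameters.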

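The asymmetric-convolution fusion rests on the linearity of convolution in the kernel: summing the outputs of several branches equals convolving once with the sum of their (zero-padded) kernels. A minimal NumPy sketch of this identity; the direct `conv3d_valid` helper, the kernel shapes and the 8³ input are illustrative, not the authors' implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv3d_valid(x, k):
    """Direct 'valid' 3D cross-correlation (additivity holds either way)."""
    kd, kh, kw = k.shape
    D, H, W = x.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for l in range(out.shape[2]):
                out[i, j, l] = np.sum(x[i:i+kd, j:j+kh, l:l+kw] * k)
    return out

def embed(k):
    """Zero-pad an asymmetric kernel into a centered 3x3x3 kernel."""
    full = np.zeros((3, 3, 3))
    sl = tuple(slice(1, 2) if s == 1 else slice(0, 3) for s in k.shape)
    full[sl] = k
    return full

x = rng.standard_normal((8, 8, 8))
k_d = rng.standard_normal((3, 1, 1))  # three asymmetric branch kernels
k_h = rng.standard_normal((1, 3, 1))
k_w = rng.standard_normal((1, 1, 3))

branches = sum(conv3d_valid(x, embed(k)) for k in (k_d, k_h, k_w))
fused = conv3d_valid(x, embed(k_d) + embed(k_h) + embed(k_w))
print(np.allclose(branches, fused))  # → True
```

Because the identity is exact, the fusion can be performed once after training; at test time only the single fused 3 × 3 × 3 convolution runs, which is why inference time does not increase.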
The rest of this paper is organized as follows: Section 2 reviews the present work including CNN-based SR models, MRI super-resolution and attention mechanism. Section 3 describes the structure of our proposed network and the key components in detail. Section 4 discusses the experimental results. Finally, Section 5 concludes the paper.

Section snippets

CNN-based SR models

In recent years, with the development of deep learning, many CNN-based SR models have been proposed. Dong et al. [23] introduced a three-layer convolutional network called SRCNN, which implements end-to-end learning and achieves better performance than traditional methods. However, SRCNN can only restore limited information, converges slowly, and loses multiscale features. Later, Kim et al. [24], [25] used global residual learning to train a deeper network which contains 20 convolutional layers,

Network structure

As shown in Fig. 1, our network mainly consists of four parts: a shallow feature extraction module (SFEM), a dilated convolution module (DCM), a local residual dense attention module (LRDAM) and a feature fusion module (FFM). I_LR and I_SR are the input and output of the network, respectively. First, we use a convolutional layer to extract shallow features. This process can be denoted as I_0 = H_SFEM(I_LR), where H_SFEM(·) represents a 3 × 3 × 3 convolution operation. Then, the shallow feature I_0 is fed to the DCM

Experimental results

In this section, we first introduce the datasets we used and the preprocessing method. Second, we provide the implementation details including the settings of the experimental environment and the parameters of the network. Third, we compare the different performances of various component combinations. Finally, we compare our method with the state-of-the-art methods.

Conclusion

For 3D medical image super-resolution, the main problem is that existing models lack the ability to discriminate among hierarchical features. In this paper, we propose a novel method for 3D medical image super-resolution based on densely connected layers. Our network mainly consists of four parts: a shallow feature extraction module (SFEM), a dilated convolution module (DCM), a local residual dense attention module (LRDAM), and a feature fusion module (FFM). In the DCM, 3D dilated convolution with

CRediT authorship contribution statement

Wei Lu: Conceptualization, Methodology, Investigation. Zhijin Song: Software, Validation, Writing - original draft. Jinghui Chu: Writing - review & editing.

Declaration of Competing Interest

No author associated with this paper has disclosed any potential or pertinent conflicts which may be perceived to have impending conflict with this work. For full disclosure statements refer to https://doi.org/10.1016/j.bspc.2020.102120.

Acknowledgment

This work was supported in part by the National Natural Science Foundation of China under Grant 61802277 and China Postdoctoral Science Foundation Funded Project (2019M651038).

References (50)

  • Yang, X., et al. Super-resolution of medical image using representation learning

  • Li, F., et al. Detail-preserving image super-resolution via recursively dilated residual network. Neurocomputing (2019)

  • Van Essen, D.C., et al. The WU-Minn human connectome project: An overview. Neuroimage (2013)

  • Huang, Y., et al. Simultaneous super-resolution and cross-modality synthesis in magnetic resonance imaging

  • Zhu, J., et al. How can we make GAN perform better in single medical image super-resolution? A lesion focused multi-scale approach

  • Keys, R. Cubic convolution interpolation for digital image processing. IEEE Trans. Acoust. Speech Signal Process. (1981)

  • Li, X., et al. New edge-directed interpolation. IEEE Trans. Image Process. (2001)

  • Shao, L., et al. Order statistic filters for image interpolation

  • Zhang, L., et al. An edge-guided image interpolation algorithm via directional filtering and data fusion. IEEE Trans. Image Process. (2006)

  • Purkait, P., et al. Super resolution image reconstruction through Bregman iteration using morphologic regularization. IEEE Trans. Image Process. (2012)

  • Bhatia, K.K., et al. Super-resolution reconstruction of cardiac MRI using coupled dictionary learning

  • Timofte, R., De Smet, V., Van Gool, L. Anchored neighborhood regression for fast example-based super-resolution, in: ...

  • Dai, D., et al. Jointly optimized regressors for image super-resolution

  • Yang, J., et al. Image super-resolution via sparse representation. IEEE Trans. Image Process. (2010)

  • Yang, X., et al. Multi-sensor image super-resolution with fuzzy cluster by using multi-scale and multi-view sparse coding for infrared image. Multimedia Tools Appl. (2017)