Improving myocardial pathology segmentation with U-Net++ and EfficientSeg from multi-sequence cardiac magnetic resonance images

https://doi.org/10.1016/j.compbiomed.2022.106218

Highlights

  • An accurate coarse-to-fine method is proposed for myocardial pathology segmentation.

  • The MBConvBlock and bidirectional feature fusion are applied to improve efficiency.

  • The Focal loss is implemented for the difficult-to-segment regions.

  • The proposed solution outperforms the first-place method in the MyoPS 2020 challenge.

Abstract

Background:

Myocardial pathology segmentation plays a crucial role in the diagnosis and treatment of myocardial infarction (MI). However, manual segmentation is time-consuming and labor-intensive, and requires extensive professional knowledge and clinical experience.

Methods:

In this work, we develop an automatic and accurate coarse-to-fine myocardial pathology segmentation framework based on the U-Net++ and EfficientSeg model. The U-Net++ network with deep supervision is first applied to delineate the cardiac structures from the multi-sequence cardiac magnetic resonance (CMR) images to generate a coarse segmentation map. Then the coarse segmentation map together with the three-sequence CMR data is sent to the EfficientSeg-B1 for fine segmentation, that is, further segmentation of myocardial scar and edema areas. In addition, we apply the Focal loss to replace the original cross-entropy term, for the purpose of encouraging the model to pay more attention to the pathological areas.

Results:

The proposed segmentation approach is tested on the public Myocardial Pathology Segmentation Challenge (MyoPS 2020) dataset. Experimental results demonstrate that our solution achieves an average Dice score of 0.7148 ± 0.2213 for scar, an average Dice score of 0.7439 ± 0.1011 for edema + scar, and a final average score of 0.7294 on all 20 testing cases, all of which outperform the first-place method in the competition. Moreover, extensive ablation experiments show that the two-stage strategy with Focal loss greatly improves the segmentation quality of pathological areas.

Conclusion:

Given its effectiveness and superiority, our method can further facilitate myocardial pathology segmentation in medical practice.

Introduction

Heart disease has become one of the most serious threats to human health, and it generally includes congenital heart disease, hypertensive heart disease, myocardial infarction, myocarditis, and so on. Myocardial infarction (MI) is a disease in which the blood supply to the heart is reduced or interrupted, leading to necrosis of the myocardium, with typical clinical symptoms such as dyspnea, chest pain, and chest tightness. In 2015, 15.9 million people worldwide suffered from myocardial infarction [1]. In clinical practice, myocardial pathology segmentation is a key step in the diagnosis and quantification of MI. Traditional diagnosis of MI requires cardiac CT or MRI images of the patient, together with pathology reports. However, this process is labor-intensive and time-consuming, depends heavily on the clinical experience of the doctors, and is difficult to reproduce. Therefore, it is of great importance to develop automated segmentation methods for myocardial pathology.

Image segmentation is one of the fundamental tasks of computer vision. In the literature, numerous traditional image segmentation algorithms have achieved satisfactory results in cardiac segmentation tasks, such as level set based methods [2], deformable models (e.g., snake) [3], and methods based on the eigenvalues of the Hessian matrix [4]. However, these algorithms require substantial manual intervention and prior knowledge, and they also suffer from strong model assumptions, too many parameters, and unrepeatable experiments [5], [6].

In recent years, deep learning based approaches have made great progress in the field of medical image processing, including segmentation of different parts and classification of lesion areas [7], [8], [9], [10], [11]. The Fully Convolutional Network (FCN) [12] achieves pixel-level classification of images, i.e., image semantic segmentation, by replacing the fully connected layers with convolutional layers and upsampling with deconvolution operations, which also greatly reduces the number of model parameters. SegNet [13] adopts the encoder–decoder structure, retains the pooling positions of the encoder, and directly restores them when upsampling, which better preserves the edge information of the segmentation. U-Net [14] consists of a compression path and an expansion path, and fuses low-level local features with high-level global features through skip connections. U-Net has had a far-reaching influence on the automatic segmentation of medical images. DeepLab [15] introduced dilated convolution, which greatly enlarges the model’s receptive field without increasing the number of parameters. It also added a conditional random field (CRF), so the model performs well when dealing with boundaries.

Numerous solutions have been developed for the MyoPS 2020 challenge [16], [17], [18], [19], [20], [21], [22], [23], [24], [25]. For example, Zhai et al. [17] first detected the boundaries of LV and MYO with a simple network, then conducted the fine segmentation by using a four-channel input, which was composed of the predicted mask and the three-sequence images. Martín-Isla et al. [18] generated pseudo LGE images from bSSFP images by SPADE [19], and developed a localization-and-segmentation method. Ma [20] first used a 2D U-Net to segment the left ventricle and myocardium from the bSSFP sequence and cropped the corresponding ROIs, and then used two 2D U-Nets to segment the scar and edema from the ROIs of LGE and T2, respectively. Zhang et al. [21] proposed a two-stage segmentation framework named ASSN-PRSN, which was divided into two parts: anatomical structure segmentation and pathological region segmentation. Ankenbrand et al. [22] used 2D U-Nets (ResNet34 backbone) as the main segmentation network, selected pathology specific sequence data to segment myocardial scar and edema, and finally integrated the predictions of all models. Yu et al. [23] proposed the dual attention U-Net end-to-end segmentation network, which embedded two branches of channel attention and spatial attention at the end of the encoder of U-Net. Zhang et al. [24] proposed a simple but effective EfficientSeg network to achieve accurate end-to-end segmentation of myocardial pathology. Cui et al. [25] adopted a deep U-Net architecture together with curriculum learning.

However, existing algorithms in the literature still have obvious defects in the segmentation of CMR pathological areas: (i) There is no ideal segmentation framework that can directly and automatically segment left ventricular myocardial scar and edema from multi-sequence CMR. (ii) Segmenting an organ requires knowledge of how the organs are arranged, yet even the neurons in the deepest layer of the U-Net have a receptive field of only 68 × 68 pixels. No part of the network can “see” the entire image, so the model cannot adequately take into account the anatomical structure and the spatial location of the region to be segmented. (iii) Because of the extreme imbalance between the myocardial pathology area and the background, many areas of the image are easily classified as background rather than as myocardial pathology, and the loss function and back-propagation provide little useful information. As a result, many segmentation networks (such as U-Net) perform poorly on very small regions.
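The 68 × 68 figure in point (ii) can be reproduced with standard receptive-field arithmetic. The sketch below is illustrative only. It assumes an encoder with four levels, each made of two 3 × 3 convolutions, separated by three 2 × 2 max-pooling layers with stride 2. The text does not state which U-Net configuration was measured, so this layer list is an assumption, and the function name `receptive_field` is hypothetical.

```python
def receptive_field(layers):
    """Return the receptive field (in input pixels) of a stack of layers.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    rf = 1      # receptive field of a single input pixel
    jump = 1    # distance in input pixels between adjacent output positions
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


CONV = (3, 1)   # 3 x 3 convolution, stride 1
POOL = (2, 2)   # 2 x 2 max pooling, stride 2

# Assumed encoder: three (conv, conv, pool) levels followed by two convolutions.
encoder = [CONV, CONV, POOL] * 3 + [CONV, CONV]

print(receptive_field(encoder))  # prints 68
```

Under these assumptions each deepest-layer neuron covers a 68 × 68 window of the input. Adding further levels or dilated convolutions to the layer list shows how a deeper encoder widens that window.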

Therefore, given the shortcomings and deficiencies of the above methods, we propose a two-stage coarse-to-fine segmentation approach for the MyoPS 2020 challenge task. In the coarse segmentation stage, we first segment the left ventricular (LV) blood pool, right ventricular (RV) blood pool, LV myocardium (MYO) and background regions. Then in the fine segmentation stage, we regard the coarse segmentation map as prior knowledge and combine it with the original CMR sequences as input to segment the myocardial pathological regions, i.e., LV myocardial edema (edema) and LV myocardial scar (scar). The main innovations and contributions of our method are summarized as follows:

  • An accurate coarse-to-fine segmentation method is proposed for myocardial pathology segmentation. The U-Net++ network with deep supervision is first applied to delineate the cardiac structures from the multi-sequence CMR images to generate a coarse segmentation map. Then the coarse segmentation map together with the three-sequence CMR data is sent to the EfficientSeg for fine segmentation.

  • We use EfficientSeg-B1 as the fine segmentation baseline to increase the network’s depth, which greatly enlarges the receptive field compared to U-Net. At the same time, MBConvBlock is used in the EfficientSeg network encoder instead of the traditional convolution operation, and bidirectional feature fusion is implemented in the decoder, which greatly improves the model efficiency.

  • In view of the extreme imbalance between the myocardial pathological area and the background, the difficulty of segmenting small areas, and the difficulty of processing segmentation boundary details, we introduce the Focal loss [26] to improve the original loss function, which encourages the model to pay more attention to the difficult-to-segment regions.

  • Extensive experiments are conducted on the MyoPS 2020 challenge dataset. Our proposed solution outperforms the first-place method in the competition.
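To illustrate how the Focal loss [26] mentioned in the contributions above shifts attention to difficult pixels, the following is a minimal NumPy sketch of the multi-class form FL(p_t) = -(1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. The value γ = 2 is the default from the original Focal loss paper and is an assumption here, since this excerpt does not state the value used. The function names and the toy probabilities are hypothetical.

```python
import numpy as np


def cross_entropy(probs, targets, eps=1e-7):
    """Mean cross-entropy. probs: (N, C) class probabilities, targets: (N,) labels."""
    p_t = np.clip(probs[np.arange(len(targets)), targets], eps, 1.0)
    return float(np.mean(-np.log(p_t)))


def focal_loss(probs, targets, gamma=2.0, eps=1e-7):
    """Mean focal loss: cross-entropy scaled per pixel by (1 - p_t) ** gamma."""
    p_t = np.clip(probs[np.arange(len(targets)), targets], eps, 1.0)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t)))


# Two toy pixels whose true class is 0: one easy (p_t = 0.95), one hard (p_t = 0.30).
probs = np.array([[0.95, 0.05],
                  [0.30, 0.70]])
targets = np.array([0, 0])

print("cross-entropy:", cross_entropy(probs, targets))
print("focal loss   :", focal_loss(probs, targets))
```

With γ = 0 the focal loss reduces to plain cross-entropy. With γ > 0 the well-classified pixel contributes almost nothing, so the hard pixel dominates the loss.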

The rest of this paper is organized as follows: Section 2 provides details about the method, including data pre-processing, baseline model, our segmentation framework, loss function and experimental details. Section 3 presents experimental results and ablation studies. Section 4 gives a brief discussion of the proposed method. In Section 5, we conclude this work.
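The coarse-to-fine data flow described in this introduction can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the coarse map is appended to the three CMR sequences as one extra channel of class labels scaled to [0, 1]. The text only states that the coarse map is sent to the fine network together with the three-sequence data, so the encoding, the label order, and the name `build_fine_input` are all assumptions.

```python
import numpy as np

# Coarse classes named in the text: background, LV, RV, MYO (order assumed).
NUM_COARSE_CLASSES = 4


def build_fine_input(bssfp, lge, t2, coarse_labels):
    """Stack three CMR sequences and the coarse label map into a (4, H, W) array."""
    sequences = np.stack([bssfp, lge, t2], axis=0).astype(np.float32)
    # Assumption: the coarse prior is a single channel scaled to [0, 1].
    prior = coarse_labels.astype(np.float32) / (NUM_COARSE_CLASSES - 1)
    return np.concatenate([sequences, prior[None]], axis=0)


# Random stand-ins for one aligned slice and for the stage-one prediction.
rng = np.random.default_rng(0)
height, width = 8, 8
bssfp, lge, t2 = (rng.random((height, width)) for _ in range(3))
coarse_labels = rng.integers(0, NUM_COARSE_CLASSES, size=(height, width))

fine_input = build_fine_input(bssfp, lge, t2, coarse_labels)
print(fine_input.shape)  # (4, 8, 8)
```

A one-hot encoding of the coarse map would be an equally plausible choice and would give a seven-channel input instead.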

Data augmentation and data pre-processing

The MyoPS 2020 challenge [27], [28] is dedicated to myocardial pathology segmentation and provides 25 cases of training (including validation) data and 20 cases of testing data. Each case consists of three sequences of CMR images (LGE, T2, and bSSFP). All three-sequence images are aligned using the MvMM algorithm [27], [28] and then resampled to the same pixel size. The MyoPS 2020 challenge provides the segmentation ground truth of CMR, which divides CMR into LV, RV, MYO, edema, scar and

Dice

As one of the most commonly used evaluation metrics for segmentation tasks, the Dice score measures the similarity between the model prediction and the ground truth, as shown in Eq. (5), where Mpre and Mtrue represent the predicted segmentation map and the ground truth, respectively. Dice is more sensitive to the segmentation of small regions, that is, for the same absolute change in the predicted segmentation, the Dice score of a small region fluctuates more. Therefore,
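Eq. (5) is not reproduced in this excerpt. The sketch below assumes the standard definition Dice = 2 |Mpre ∩ Mtrue| / (|Mpre| + |Mtrue|) for binary masks, and uses two toy masks to show why the score of a small region fluctuates more. The function name `dice_score`, the mask sizes, and the convention of returning 1.0 for two empty masks are illustrative assumptions.

```python
import numpy as np


def dice_score(m_pre, m_true):
    """Dice overlap between two binary masks of the same shape."""
    m_pre = np.asarray(m_pre, dtype=bool)
    m_true = np.asarray(m_true, dtype=bool)
    total = m_pre.sum() + m_true.sum()
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    intersection = np.logical_and(m_pre, m_true).sum()
    return float(2.0 * intersection / total)


def square_mask(side, canvas=16):
    """Binary mask with a side x side square in the top-left corner."""
    mask = np.zeros((canvas, canvas), dtype=bool)
    mask[:side, :side] = True
    return mask


# Remove one pixel from the prediction of a small and of a large region.
small_true, large_true = square_mask(2), square_mask(10)
small_pre, large_pre = small_true.copy(), large_true.copy()
small_pre[0, 0] = False
large_pre[0, 0] = False

print("small region:", dice_score(small_pre, small_true))  # 6/7
print("large region:", dice_score(large_pre, large_true))  # 198/199
```

The same one-pixel error costs the 4-pixel region about 0.14 Dice, while the 100-pixel region loses about 0.005.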

Discussion

Myocardial pathology segmentation plays an important role in the computer-aided diagnosis of cardiac diseases, particularly myocardial infarction. However, the performance of existing deep learning based methods is unsatisfactory due to the small area of myocardial pathology, poor image quality, and limited dataset size. To address the above issues, we propose an automatic coarse-to-fine myocardial pathology segmentation framework based on the EfficientSeg network.

There are still some

Conclusions

In this work, we propose a two-stage coarse-to-fine myocardial pathology segmentation framework. In the first stage, the U-Net++ network with deep supervision is applied to delineate the cardiac structures from the multi-sequence CMR images and generate a coarse segmentation map. In the second stage, the coarse segmentation map together with the three-sequence CMR data is sent to the EfficientSeg-B1 for further segmentation of myocardial scar and edema areas. To encourage the model to pay

CRediT authorship contribution statement

Hengfei Cui: Designed the method, Prepared the manuscript. Yan Li: Designed the method, Prepared the manuscript. Lei Jiang: Performed the experiments. Yifan Wang: Performed the experiments. Yong Xia: Improved the results presentation. Yanning Zhang: Improved the results presentation.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

The study was supported in part by the National Natural Science Foundation of China under Grants 62271405 and 62171377, in part by the Fundamental Research Funds for the Central Universities under Grant 3102020QD1001, and in part by the Key Research and Development Program of Shaanxi Province, China under Grant 2022GY-084.

References (35)

  • Vesal S. et al.

    Spatio-temporal multi-task learning for cardiac MRI left ventricle quantification

    IEEE J. Biomed. Health Inform.

    (2020)

  • Liu Yueyun et al.

    Effective 3D boundary learning via a nonlocal deformable network

  • J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: Proceedings of the IEEE...

  • Badrinarayanan V. et al.

    SegNet: A deep convolutional encoder–decoder architecture for image segmentation

    IEEE Trans. Pattern Anal. Mach. Intell.

    (2017)

  • Ronneberger O. et al.

    U-Net: Convolutional networks for biomedical image segmentation

  • Chen L.C. et al.

    Semantic image segmentation with deep convolutional nets and fully connected CRFs

    (2014)

  • Myocardial Pathology Segmentation Combining Multi-Sequence Cardiac Magnetic Resonance Images: First Challenge, MyoPS 2020, Held in Conjunction with MICCAI 2020, Lima, Peru, October 4, 2020, Proceedings

    (2020)

1 Contributed equally to this work.
