Abstract

Accurate segmentation of the right ventricle (RV) in cardiac magnetic resonance (CMR) images is crucial for ventricular structure and function assessment. However, due to its variable anatomy and ill-defined borders, RV segmentation remains an open problem. While recent advances in deep learning show great promise in tackling these challenges, such methods are typically developed on homogeneous data-sets, not reflecting realistic clinical variation in image acquisition and pathology. In this work, we develop a model, aimed at segmenting all three cardiac structures in a multi-center, multi-disease and multi-view setting, using data provided by the M&Ms-2 challenge. We propose a pipeline addressing various aspects of segmenting heterogeneous data, consisting of heart region detection, augmentation through image synthesis and multi-fusion segmentation. Our extensive experiments demonstrate the importance of different elements of the pipeline, achieving competitive results for RV segmentation in both short-axis and long-axis MR images.

Y. Al Khalil and S. Amirrajab—Contributed equally.

Notes

  1. https://www.ub.edu/mnms/

References

  1. Abbasi-Sureshjani, S., Amirrajab, S., Lorenz, C., Weese, J., Pluim, J., Breeuwer, M.: 4D semantic cardiac magnetic resonance image synthesis on XCAT anatomical model. In: Medical Imaging with Deep Learning, pp. 6–18. PMLR (2020)

  2. Amirrajab, S., et al.: XCAT-GAN for synthesizing 3D consistent labeled cardiac MR images on anatomically variable XCAT phantoms. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 128–137 (2020)

  3. Attili, A.K., Schuster, A., Nagel, E., Reiber, J.H., van der Geest, R.J.: Quantification in cardiac MRI: advances in image acquisition and processing. Int. J. Cardiovasc. Imaging 26(1), 27–40 (2010)

  4. Avendi, M.R., Kheradvar, A., Jafarkhani, H.: Automatic segmentation of the right ventricle from cardiac MRI using a learning-based approach. Magn. Reson. Med. 78(6), 2439–2448 (2017)

  5. Bai, W., et al.: A probabilistic patch-based label fusion model for multi-atlas segmentation with registration refinement: application to cardiac MR images. IEEE Trans. Med. Imaging 32(7), 1302–1315 (2013)

  6. Campello, V.M., et al.: Multi-centre, multi-vendor and multi-disease cardiac segmentation: the M&Ms challenge. IEEE Trans. Med. Imaging 40(12), 3543–3554 (2021)

  7. Caudron, J., Fares, J., Vivier, P.H., Lefebvre, V., Petitjean, C., Dacher, J.N.: Diagnostic accuracy and variability of three semi-quantitative methods for assessing right ventricular systolic function from cardiac MRI in patients with acquired heart disease. Eur. Radiol. 21(10), 2111–2120 (2011)

  8. Chen, C., et al.: Deep learning for cardiac image segmentation: a review. Front. Cardiovasc. Med. 7, 25 (2020)

  9. Dolz, J., Desrosiers, C., Ayed, I.B.: IVD-net: intervertebral disc localization and segmentation in MRI with a multi-modal UNet. In: International Workshop and Challenge on Computational Methods and Clinical Applications for Spine Imaging, pp. 130–143 (2018)

  10. Grosgeorge, D., Petitjean, C., Caudron, J., Fares, J., Dacher, J.N.: Automatic cardiac ventricle segmentation in MR images: a validation study. Int. J. Comput. Assist. Radiol. Surg. 6(5), 573–581 (2011)

  11. Grosgeorge, D., Petitjean, C., Dacher, J.N., Ruan, S.: Graph cut segmentation with a statistical shape model in cardiac MRI. Comput. Vis. Image Underst. 117(9), 1027–1035 (2013)

  12. Haddad, F., Hunt, S.A., Rosenthal, D.N., Murphy, D.J.: Right ventricular function in cardiovascular disease, part I: anatomy, physiology, aging, and functional assessment of the right ventricle. Circulation 117(11), 1436–1448 (2008)

  13. He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)

  14. Hosny, A., Parmar, C., Quackenbush, J., Schwartz, L.H., Aerts, H.J.: Artificial intelligence in radiology. Nat. Rev. Cancer 18(8), 500–510 (2018)

  15. Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18(2), 203–211 (2021)

  16. Li, B., Que, D.: Medical images denoising based on total variation algorithm. Procedia Environ. Sci. 8, 227–234 (2011)

  17. Marchesseau, S., Ho, J.X., Totman, J.J.: Influence of the short-axis cine acquisition protocol on the cardiac function evaluation: a reproducibility study. Eur. J. Radiol. Open 3, 60–66 (2016)

  18. Martin-Isla, C., et al.: Image-based cardiac diagnosis with machine learning: a review. Front. Cardiovasc. Med. 7, 1 (2020)

  19. Nyúl, L.G., Udupa, J.K., Zhang, X.: New variants of a method of MRI scale standardization. IEEE Trans. Med. Imaging 19(2), 143–150 (2000)

  20. Ou, Y., Doshi, J., Erus, G., Davatzikos, C.: Multi-atlas segmentation of the cardiac MR right ventricle. In: Proceedings of 3D Cardiovascular Imaging: A MICCAI Segmentation Challenge (2012)

  21. Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2337–2346 (2019)

  22. Petitjean, C., Zuluaga, M.A., et al.: Right ventricle segmentation from cardiac MRI: a collation study. Med. Image Anal. 19(1), 187–202 (2015)

  23. Ringenberg, J., Deo, M., Devabhaktuni, V., Berenfeld, O., Boyers, P., Gold, J.: Fast, accurate, and fully automatic segmentation of the right ventricle in short-axis cardiac MRI. Comput. Med. Imaging Graph. 38(3), 190–201 (2014)

  24. Rumsfeld, J.S., Joynt, K.E., Maddox, T.M.: Big data analytics to improve cardiovascular care: promise and challenges. Nat. Rev. Cardiol. 13(6), 350 (2016)

  25. Scannell, C.M., et al.: Deep-learning-based preprocessing for quantitative myocardial perfusion MRI. J. Magn. Reson. Imaging 51(6), 1689–1696 (2020)

  26. Shameer, K., Johnson, K.W., Glicksberg, B.S., Dudley, J.T., Sengupta, P.P.: Machine learning in cardiovascular medicine: are we there yet? Heart 104(14), 1156–1164 (2018)

  27. Simon, M.A.: Assessment and treatment of right ventricular failure. Nat. Rev. Cardiol. 10(4), 204–218 (2013)

  28. Wang, C.W., Peng, C.W., Chen, H.C.: A simple and fully automatic right ventricle segmentation method for 4-dimensional cardiac MR images. In: Proceedings of MICCAI RV Segmentation Challenge (2012)

  29. Yan, W., Huang, L., Xia, L., et al.: MRI manufacturer shift and adaptation: increasing the generalizability of deep learning segmentation for MR images acquired with different scanners. Radiol. Artif. Intell. 2(4), e190195 (2020)

  30. Yilmaz, P., Wallecan, K., Kristanto, W., Aben, J.P., Moelker, A.: Evaluation of a semi-automatic right ventricle segmentation method on short-axis MR images. J. Digit. Imaging 31(5), 670–679 (2018)

  31. Zuluaga, M.A., Cardoso, M.J., Modat, M., Ourselin, S.: Multi-atlas propagation whole heart segmentation from MRI and CTA using a local normalised correlation coefficient criterion. In: Ourselin, S., Rueckert, D., Smith, N. (eds.) FIMH 2013. LNCS, vol. 7945, pp. 174–181. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38899-6_21

Acknowledgments

This research is a part of the openGTN project, supported by the European Union in the Marie Curie Innovative Training Networks (ITN) fellowship program under project No. 764465.

Author information

Corresponding author

Correspondence to Yasmina Al Khalil.

Appendices

Appendix A. Pre-processing Stage

A.1. Heart Detection Module

As presented in Sect. 2.2, the first stage of our pipeline is a heart region detection module: a regression-based neural network that locates and extracts the heart in both SA and LA images, similar to the approach used in [25]. Before generating the training labels and cropping, we resample all SA images to the median spatial resolution of 1.25 \(\times \) 1.25 \(\times \) 10 mm\(^{3}\) and all LA images to an in-plane resolution of 1.25 \(\times \) 1.25 mm\(^{2}\). We use a simple CNN designed for a regression task, whose output consists of 6 continuous values. The inputs to the network are 2D (256 \(\times \) 256) slices, namely the mid-cavity slices extracted from the training data-set for SA images and all slices for LA images, normalized to have intensity values in the range [0,1]. The outputs are the parameters that define the bounding box: the x and y positions of the center of the initialized ROI and of its lower left corner, as well as the scaling factors for the width and height of the initial ROI.

The CNN consists of five convolutional layers, followed by two fully-connected layers with a linear activation. Each convolutional layer uses 3 \(\times \) 3 kernels and is followed by a 2 \(\times \) 2 max-pooling layer. Batch normalization and leaky ReLU activations are used in each layer except the output layer. Dropout with a probability of 0.5 is used in the fully connected layers. The network is trained for 2000 epochs with a batch size of 32 and early stopping (assessed from the validation accuracy), using the Adam optimizer to minimize the mean squared error between the predicted transformation and the actual transformation (estimated from the ground truth). We start with an initial learning rate of 0.001 and halve it every 250 epochs. All image dimensions and scaling/displacement parameters are normalized so that the resulting translations lie in the range from −1 to 1.
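The learning rate schedule described above can be written as a one-line step decay. The following is a minimal sketch, assuming 0-based epoch counting and that the first halving takes effect at epoch 250; the function name `step_decay_lr` is hypothetical and not taken from the authors' code.

```python
def step_decay_lr(epoch, initial_lr=1e-3, factor=0.5, step=250):
    """Learning rate in effect at a given 0-based epoch.

    Assumption: the rate is multiplied by `factor` once every `step`
    epochs, starting from `initial_lr` (values taken from the text).
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial_lr * factor ** (epoch // step)


# Print the rate at a few epochs across the 2000-epoch training run.
for epoch in (0, 249, 250, 500, 1999):
    print(epoch, step_decay_lr(epoch))
```

Under these assumptions the rate is halved seven times over 2000 epochs, ending at 0.001 / 128.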

After prediction, all parameters are de-normalized to reflect the original image scale. On-the-fly data augmentation is applied to the training images, consisting of random translation, rotation, scaling, vertical and horizontal flips, contrast augmentation and the addition of noise. At inference time, we again use mid-cavity slices from the SA test images to obtain the adjustment parameters of the ROI. The bounding box predicted on the mid-cavity slice of an SA image is then propagated through the whole 3D volume from which the slice was extracted. This propagation is not needed for LA images, where direct detection is possible (both ED and ES LA images consist of a single slice only). The SA and LA images cropped with the predicted bounding box are post-processed to a size of 128 \(\times \) 128 voxels and \(176 \times 176\) voxels, respectively. These images are then used for training the cardiac cavity segmentation and synthesis networks.
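The de-normalization and propagation steps can be illustrated with a short NumPy sketch. It is a simplification under stated assumptions: the box is described here by four hypothetical values (center x, center y, width, height), each normalized to [−1, 1], whereas the network in the text predicts six values whose exact encoding is not reproduced. The function names are illustrative only.

```python
import numpy as np


def denormalize_box(params, height, width):
    """Map normalized (cx, cy, w, h) in [-1, 1] to pixel units.

    Assumption: a simplified 4-value parameterization for illustration;
    the network described in the text outputs 6 values.
    """
    cx, cy, w, h = params

    def to_unit(v):
        return (v + 1.0) / 2.0  # [-1, 1] -> [0, 1]

    return (to_unit(cx) * width, to_unit(cy) * height,
            to_unit(w) * width, to_unit(h) * height)


def crop_volume(volume, box):
    """Apply one in-plane box to every slice of a (slices, H, W) volume."""
    cx, cy, w, h = box
    _, height, width = volume.shape
    x0 = max(0, int(round(cx - w / 2)))
    x1 = min(width, int(round(cx + w / 2)))
    y0 = max(0, int(round(cy - h / 2)))
    y1 = min(height, int(round(cy + h / 2)))
    return volume[:, y0:y1, x0:x1]


# Toy SA stack; the box would come from the mid-cavity slice prediction.
volume = np.zeros((10, 256, 256), dtype=np.float32)
box = denormalize_box((0.0, 0.0, 0.0, 0.0), height=256, width=256)
cropped = crop_volume(volume, box)
print(cropped.shape)
```

Because the same in-plane box is applied to all slices, a single prediction on the mid-cavity slice is enough to crop the whole SA stack.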

A.2. Appearance Transformations for Targeting Variation in Contrast and Intensity

One of the main challenges of deploying a segmentation algorithm on heterogeneous data is maintaining its performance in the presence of extensive contrast and intensity variations. By exploring the provided training and validation sets, we observe not only that data acquired from different vendors varies in contrast, but also that the presence of pathology strongly influences tissue visibility and often occludes tissue boundaries. Applying image appearance transformations can help improve both contrast and tissue visibility, and put more emphasis on tissue shape rather than appearance. To achieve this, we select a set of six transformations per image, where each is fed into a separate encoding path during the training of the late fusion model, namely:

  1. Histogram standardization: We standardize the intensities of images to those representative of each scanner vendor, by utilizing the algorithm in [19], which detects the landmarks on image histograms in the training set and averages them to form a standard landmark set per vendor. When a new image is acquired, the detected landmarks of its histogram are then matched to the previously computed standard positions by linear interpolation of intensities between the landmarks. A similar approach is applied at inference time using landmarks calculated from the training data. Thus, for each image, we generate its three counterparts, standardized to the landmarks extracted from GE, Siemens and Philips-acquired images.

  2. Edge preserving filtering: To emphasize the shape of the heart cavities and discard high frequency features, we apply total variation filtering (TVF) on the original input image. TVF is typically used for denoising and produces images with flat domain separated by enhanced edges [16].

  3. Solarization and posterization: Solarization can be defined as “partial” inversion of light and dark intensity values, with the total solarization being the negative of the image. Posterization retains the general appearance of the image, but gradual transitions are replaced by abrupt changes in shading from one region to another. This emphasizes edges, flattens the image, and is typically used for contour tracing.

  4. Laplacian filter: The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection.
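Several of the transformations listed above can be sketched in a few lines of NumPy. The block below is an illustrative approximation, not the authors' implementation: the percentile landmarks, the solarization threshold, the number of posterization levels and all function names are assumptions, the landmark mapping is a simplified version of the method in [19], and total variation filtering is omitted.

```python
import numpy as np

# Assumed landmark percentiles; the text does not state which were used.
PERCENTILES = (1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 99)


def learn_standard_landmarks(images):
    """Average the histogram landmarks of a set of images (e.g. one vendor),
    after rescaling each image's landmarks to the range [0, 1]."""
    landmarks = []
    for img in images:
        p = np.percentile(img, PERCENTILES)
        landmarks.append((p - p[0]) / (p[-1] - p[0]))
    return np.mean(landmarks, axis=0)


def standardize(img, standard):
    """Piecewise-linearly map an image's landmarks onto the standard ones."""
    p = np.percentile(img, PERCENTILES)
    return np.interp(img, p, standard)


def solarize(img, threshold=0.5):
    """Invert intensities at or above `threshold` (image assumed in [0, 1])."""
    return np.where(img >= threshold, 1.0 - img, img)


def posterize(img, levels=4):
    """Quantize an image in [0, 1] to `levels` evenly spaced gray values."""
    bins = np.minimum(np.floor(img * levels), levels - 1)
    return bins / (levels - 1)


def laplacian(img):
    """4-neighbour discrete Laplacian of a 2D image with edge replication."""
    p = np.pad(img, 1, mode="edge")
    return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]
            - 4.0 * img)


rng = np.random.default_rng(0)
train = [rng.random((16, 16)) * scale for scale in (1.0, 5.0)]
standard = learn_standard_landmarks(train)
views = [f(standardize(train[1], standard))
         for f in (solarize, posterize, laplacian)]
print([v.shape for v in views])
```

Each resulting view keeps the spatial size of the input, so the views can be fed to separate encoder paths of the same shape.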

Appendix B. Image Synthesis Models

Two identical image synthesis models are trained, one on LA and one on SA cardiac MR images. To augment and balance the data using these trained synthesis models, the following strategies are devised:

  i) For each vendor-specific subset, outlier cases are identified based on the end-diastolic or end-systolic RV volume, calculated from the ground truth labels of the SA images. These outlier cases, separated from the rest of the population, are used for image synthesis by applying random label deformations. To balance the ratio, we vary the number of deformations per case so that we eventually create 1000 synthesized cases for each vendor, 50% from outliers and 50% from the remaining cases.

  ii) For each subject, the ratio of mid-ventricular and apical slices to basal slices is not balanced in the SA stacks; there are typically 2–3 basal slices compared to 6–8 mid-ventricular and apical slices. The segmentation network may therefore see basal slices less frequently during training than other slices, which could account for network failure on these challenging slices. To increase the occurrence of these examples, we take the labels of the three most basal slices of all cases and randomly deform each of them 10 times for image synthesis.
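The balancing arithmetic in strategy i) can be made concrete with a small sketch. The case counts below (12 outliers, 48 remaining cases) are hypothetical, since the per-vendor counts are not given here, and the even split across cases is an assumption about how the deformations might be distributed.

```python
def deformations_per_case(n_cases, target):
    """Split `target` synthesized images as evenly as possible over `n_cases`."""
    if n_cases <= 0:
        raise ValueError("need at least one source case")
    base, extra = divmod(target, n_cases)
    # The first `extra` cases receive one additional deformation.
    return [base + 1 if i < extra else base for i in range(n_cases)]


def balanced_plan(n_outliers, n_regular, total=1000):
    """Number of label deformations per case so that outliers and the
    remaining cases each contribute half of `total` synthesized cases."""
    half = total // 2
    return {
        "outliers": deformations_per_case(n_outliers, half),
        "regular": deformations_per_case(n_regular, total - half),
    }


# Hypothetical vendor subset: 12 outlier cases and 48 remaining cases.
plan = balanced_plan(n_outliers=12, n_regular=48)
print(max(plan["outliers"]), max(plan["regular"]))
```

With these toy counts each outlier label is deformed about four times as often as a regular one, which is how the 50/50 ratio is reached despite outliers being rare.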

Appendix C. Cardiac Cavity Segmentation Architecture and Training Procedure

The late fusion U-Net segmentation model learns a separate convolutional encoder path for each transformed image, and the features of these paths are fused at their higher layers (or the bottleneck). Here, we assume that higher-level representations from different transformations of each image are more complementary to each other, while containing distinctive features that aid the segmentation process. Each encoding path consists of five convolutional blocks, with four max-pooling layers. Each convolutional block consists of convolutional layers with 3 \(\times \) 3 kernels, batch normalization and leaky ReLU activation. We apply batch normalization to improve regularization and make the network less susceptible to noise and intensity variation. Moreover, we apply dropout regularization, with a rate of 0.5, after each concatenation operation to further avoid over-fitting.
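The shape bookkeeping behind late fusion can be followed with a toy NumPy sketch. It is deliberately not a trainable network: convolutions, batch normalization and dropout are omitted, each encoder path is reduced to its four max-pooling steps, and fusion is assumed to be channel-wise concatenation at the bottleneck. All function names are hypothetical.

```python
import numpy as np


def max_pool_2x2(x):
    """2 x 2 max-pooling on an array of shape (C, H, W) with even H and W."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


def encode(x, n_pools=4):
    """Stand-in for one encoder path: pooling only, convolutions omitted."""
    for _ in range(n_pools):
        x = max_pool_2x2(x)
    return x


def late_fusion(inputs):
    """Encode each transformed image separately, then concatenate the
    bottleneck features along the channel axis."""
    return np.concatenate([encode(x) for x in inputs], axis=0)


# Six transformed views of one 128 x 128 SA slice, one channel each.
rng = np.random.default_rng(0)
views = [rng.random((1, 128, 128)) for _ in range(6)]
fused = late_fusion(views)
print(fused.shape)
```

Four pooling steps reduce 128 × 128 inputs to 8 × 8 at the bottleneck, and concatenating six paths multiplies the channel count by six, so the decoder that follows sees all views at once.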

To increase robustness and cover a wide range of variations in heart pose and size, we additionally augment the training set. Namely, we apply random vertical and horizontal flips (p = 0.5), random rotation by integer multiples of \(\frac{\pi }{2}\) (p = 0.5), random scaling with a scale factor s \(\in \) [0.8, 1.2] (p = 0.2), random translations (p = 0.3) and mirroring (p = 0.5). All augmentations are applied on the fly during training. At inference time, besides normalization and in-plane re-sampling, we apply a set of six transformations to generate six images at the input to the model. After pre-processing, each encoding path is fed with batches of 144 images of size 128 \(\times \) 128 for the SA segmentation model and batches of 64 images of size 256 \(\times \) 256 for the LA model. We use a validation set to track the training progress and identify overfitting; the same augmentation approach is applied to the validation set and the mean Dice score is calculated for each epoch. To train the network, we use a weighted sum of the categorical cross-entropy and Dice loss. We use Adam for optimization, with an initial learning rate of 10\(^{-4}\) and a weight decay of 3 \(\cdot \) 10\(^{-5}\). During training, the learning rate is reduced by a factor of 5 if the validation loss does not improve by at least 5 \(\cdot \) 10\(^{-3}\) for 50 epochs. We apply early stopping on the validation set to avoid overfitting and select the model with the highest accuracy. We train each model (LA and SA) using five-fold cross-validation on the training cases and use the resulting models as an ensemble to predict on the validation or testing set. The training of all models runs for 1000 epochs.
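A weighted sum of categorical cross-entropy and Dice loss can be sketched in NumPy as follows. The weighting is not specified in the text, so the equal weights used here (`ce_weight=0.5`) are an assumption, as are the soft Dice formulation averaged over classes and the function name.

```python
import numpy as np


def dice_ce_loss(probs, onehot, ce_weight=0.5, eps=1e-7):
    """Weighted sum of categorical cross-entropy and soft Dice loss.

    probs, onehot: arrays of shape (N, C, H, W); `probs` sums to 1 over C.
    Assumption: `ce_weight` and the class-averaged soft Dice are
    illustrative choices, not values reported in the text.
    """
    # Cross-entropy averaged over all pixels.
    ce = -np.mean(np.sum(onehot * np.log(probs + eps), axis=1))
    # Soft Dice per class, computed over the whole batch.
    inter = np.sum(probs * onehot, axis=(0, 2, 3))
    denom = np.sum(probs, axis=(0, 2, 3)) + np.sum(onehot, axis=(0, 2, 3))
    dice = np.mean((2.0 * inter + eps) / (denom + eps))
    return ce_weight * ce + (1.0 - ce_weight) * (1.0 - dice)


# Toy example with 4 classes on 2 images of 8 x 8 pixels.
rng = np.random.default_rng(0)
labels = rng.integers(0, 4, size=(2, 8, 8))
onehot = np.moveaxis(np.eye(4)[labels], -1, 1)  # (N, H, W, C) -> (N, C, H, W)
uniform = np.full(onehot.shape, 0.25)

print(dice_ce_loss(onehot, onehot), dice_ce_loss(uniform, onehot))
```

A perfect prediction drives both terms to approximately zero, while an uninformative uniform prediction is penalized by both the cross-entropy and the Dice term.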

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Al Khalil, Y., Amirrajab, S., Pluim, J., Breeuwer, M. (2022). Late Fusion U-Net with GAN-Based Augmentation for Generalizable Cardiac MRI Segmentation. In: Puyol Antón, E., et al. Statistical Atlases and Computational Models of the Heart. Multi-Disease, Multi-View, and Multi-Center Right Ventricular Segmentation in Cardiac MRI Challenge. STACOM 2021. Lecture Notes in Computer Science, vol 13131. Springer, Cham. https://doi.org/10.1007/978-3-030-93722-5_39

  • DOI: https://doi.org/10.1007/978-3-030-93722-5_39

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93721-8

  • Online ISBN: 978-3-030-93722-5

  • eBook Packages: Computer Science (R0)
