Abstract
Accurate segmentation of the right ventricle (RV) in cardiac magnetic resonance (CMR) images is crucial for assessing ventricular structure and function. However, due to its variable anatomy and ill-defined borders, RV segmentation remains an open problem. While recent advances in deep learning show great promise in tackling these challenges, such methods are typically developed on homogeneous datasets, not reflecting realistic clinical variation in image acquisition and pathology. In this work, we develop a model aimed at segmenting all three cardiac structures in a multi-center, multi-disease and multi-view setting, using data provided by the M&Ms-2 challenge. We propose a pipeline addressing various aspects of segmenting heterogeneous data, consisting of heart region detection, augmentation through image synthesis and multi-fusion segmentation. Our extensive experiments demonstrate the importance of the different elements of the pipeline, achieving competitive results for RV segmentation in both short-axis and long-axis MR images.
Y. Al Khalil and S. Amirrajab contributed equally.
References
Abbasi-Sureshjani, S., Amirrajab, S., Lorenz, C., Weese, J., Pluim, J., Breeuwer, M.: 4D semantic cardiac magnetic resonance image synthesis on XCAT anatomical model. In: Medical Imaging with Deep Learning, pp. 6–18. PMLR (2020)
Amirrajab, S., et al.: XCAT-GAN for synthesizing 3D consistent labeled cardiac MR images on anatomically variable XCAT phantoms. In: International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 128–137 (2020)
Attili, A.K., Schuster, A., Nagel, E., Reiber, J.H., van der Geest, R.J.: Quantification in cardiac MRI: advances in image acquisition and processing. Int. J. Cardiovasc. Imaging 26(1), 27–40 (2010)
Avendi, M.R., Kheradvar, A., Jafarkhani, H.: Automatic segmentation of the right ventricle from cardiac MRI using a learning-based approach. Magn. Reson. Med. 78(6), 2439–2448 (2017)
Bai, W., et al.: A probabilistic patch-based label fusion model for multi-atlas segmentation with registration refinement: application to cardiac MR images. IEEE Trans. Med. Imaging 32(7), 1302–1315 (2013)
Campello, V.M., et al.: Multi-centre, multi-vendor and multi-disease cardiac segmentation: the M&Ms challenge. IEEE Trans. Med. Imaging 40(12), 3543–3554 (2021)
Caudron, J., Fares, J., Vivier, P.H., Lefebvre, V., Petitjean, C., Dacher, J.N.: Diagnostic accuracy and variability of three semi-quantitative methods for assessing right ventricular systolic function from cardiac MRI in patients with acquired heart disease. Eur. Radiol. 21(10), 2111–2120 (2011)
Chen, C., et al.: Deep learning for cardiac image segmentation: a review. Front. Cardiovasc. Med. 7, 25 (2020)
Dolz, J., Desrosiers, C., Ayed, I.B.: IVD-net: intervertebral disc localization and segmentation in MRI with a multi-modal UNet. In: International Workshop and Challenge on Computational Methods and Clinical Applications for Spine Imaging, pp. 130–143 (2018)
Grosgeorge, D., Petitjean, C., Caudron, J., Fares, J., Dacher, J.N.: Automatic cardiac ventricle segmentation in MR images: a validation study. Int. J. Comput. Assist. Radiol. Surg. 6(5), 573–581 (2011)
Grosgeorge, D., Petitjean, C., Dacher, J.N., Ruan, S.: Graph cut segmentation with a statistical shape model in cardiac MRI. Comput. Vis. Image Underst. 117(9), 1027–1035 (2013)
Haddad, F., Hunt, S.A., Rosenthal, D.N., Murphy, D.J.: Right ventricular function in cardiovascular disease, part I: anatomy, physiology, aging, and functional assessment of the right ventricle. Circulation 117(11), 1436–1448 (2008)
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778 (2016)
Hosny, A., Parmar, C., Quackenbush, J., Schwartz, L.H., Aerts, H.J.: Artificial intelligence in radiology. Nat. Rev. Cancer 18(8), 500–510 (2018)
Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: a self-configuring method for deep learning-based biomedical image segmentation. Nat. Methods 18(2), 203–211 (2021)
Li, B., Que, D.: Medical images denoising based on total variation algorithm. Procedia Environ. Sci. 8, 227–234 (2011)
Marchesseau, S., Ho, J.X., Totman, J.J.: Influence of the short-axis cine acquisition protocol on the cardiac function evaluation: a reproducibility study. Eur. J. Radiol. Open 3, 60–66 (2016)
Martin-Isla, C., et al.: Image-based cardiac diagnosis with machine learning: a review. Front. Cardiovasc. Med. 7, 1 (2020)
Nyúl, L.G., Udupa, J.K., Zhang, X.: New variants of a method of MRI scale standardization. IEEE Trans. Med. Imaging 19(2), 143–150 (2000)
Ou, Y., Doshi, J., Erus, G., Davatzikos, C.: Multi-atlas segmentation of the cardiac MR right ventricle. In: Proceedings of 3D Cardiovascular Imaging: A MICCAI Segmentation Challenge (2012)
Park, T., Liu, M.Y., Wang, T.C., Zhu, J.Y.: Semantic image synthesis with spatially-adaptive normalization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 2337–2346 (2019)
Petitjean, C., Zuluaga, M.A., et al.: Right ventricle segmentation from cardiac MRI: a collation study. Med. Image Anal. 19(1), 187–202 (2015)
Ringenberg, J., Deo, M., Devabhaktuni, V., Berenfeld, O., Boyers, P., Gold, J.: Fast, accurate, and fully automatic segmentation of the right ventricle in short-axis cardiac MRI. Comput. Med. Imaging Graph. 38(3), 190–201 (2014)
Rumsfeld, J.S., Joynt, K.E., Maddox, T.M.: Big data analytics to improve cardiovascular care: promise and challenges. Nat. Rev. Cardiol. 13(6), 350 (2016)
Scannell, C.M., et al.: Deep-learning-based preprocessing for quantitative myocardial perfusion MRI. J. Magn. Reson. Imaging 51(6), 1689–1696 (2020)
Shameer, K., Johnson, K.W., Glicksberg, B.S., Dudley, J.T., Sengupta, P.P.: Machine learning in cardiovascular medicine: are we there yet? Heart 104(14), 1156–1164 (2018)
Simon, M.A.: Assessment and treatment of right ventricular failure. Nat. Rev. Cardiol. 10(4), 204–218 (2013)
Wang, C.W., Peng, C.W., Chen, H.C.: A simple and fully automatic right ventricle segmentation method for 4-dimensional cardiac MR images. In: Proceedings of MICCAI RV Segmentation Challenge (2012)
Yan, W., Huang, L., Xia, L., et al.: MRI manufacturer shift and adaptation: increasing the generalizability of deep learning segmentation for MR images acquired with different scanners. Radiol. Artif. Intell. 2(4), e190195 (2020)
Yilmaz, P., Wallecan, K., Kristanto, W., Aben, J.P., Moelker, A.: Evaluation of a semi-automatic right ventricle segmentation method on short-axis MR images. J. Digit. Imaging 31(5), 670–679 (2018)
Zuluaga, M.A., Cardoso, M.J., Modat, M., Ourselin, S.: Multi-atlas propagation whole heart segmentation from MRI and CTA using a local normalised correlation coefficient criterion. In: Ourselin, S., Rueckert, D., Smith, N. (eds.) FIMH 2013. LNCS, vol. 7945, pp. 174–181. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38899-6_21
Acknowledgments
This research is a part of the openGTN project, supported by the European Union in the Marie Curie Innovative Training Networks (ITN) fellowship program under project No. 764465.
Appendices
Appendix A. Pre-processing Stage
A.1. Heart Detection Module
As presented in Sect. 2.2, the first stage of our pipeline is a heart region detection module, consisting of a regression-based neural network that locates and extracts the heart in both SA and LA images, similar to the approach used in [25]. Before generating the training labels, we resample all SA images to a median spatial resolution of 1.25 \(\times \) 1.25 \(\times \) 10 mm\(^{3}\) and all LA images to an in-plane resolution of 1.25 \(\times \) 1.25 mm\(^{2}\) before cropping. We use a simple CNN designed for a regression task, where the output consists of six continuous values. The inputs to the network are 2D (256 \(\times \) 256) mid-cavity slices extracted from the SA training volumes, together with all LA slices, normalized to intensity values in the range [0, 1]. The outputs are the parameters that define the bounding box, namely the x and y coordinates of the center of the initialized ROI and of its lower-left corner, as well as the scaling factors for the width and height of the initial ROI.
The CNN consists of five convolutional layers, followed by two fully-connected layers with a linear activation. Each convolutional layer uses 3 \(\times \) 3 kernels and is followed by a 2 \(\times \) 2 max-pooling layer. Batch normalization and leaky ReLU activations are used in each layer, except for the output. Dropout with a probability of 0.5 is used in the fully-connected layers. The network is trained for 2000 epochs with a batch size of 32 and early stopping (based on validation accuracy), minimizing the mean squared error between the computed transformation and the actual transformation (estimated from the ground truth) using the Adam optimizer. We start with an initial learning rate of 0.001 and decrease it by a factor of 0.5 every 250 epochs. All image dimensions and scaling/displacement parameters are normalized such that the generated translations lie in the range [−1, 1].
After prediction, all parameters are de-normalized to the original image scale. On-the-fly data augmentation is applied to the training images, consisting of random translation, rotation, scaling, vertical and horizontal flips, contrast augmentation and addition of noise. At inference time, we again use mid-cavity slices from the SA test images to obtain the ROI adjustment parameters. The bounding boxes predicted on mid-cavity SA slices are then propagated through the whole 3D volume from which these slices were extracted. This step is not needed for LA images, where direct detection is possible (both ED and ES LA images consist of a single slice only). The cropped SA and LA images obtained using the predicted bounding box are post-processed to a size of 128 \(\times \) 128 and \(176 \times 176\) voxels, respectively. These images are then used for training the cardiac cavity segmentation and synthesis networks.
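As an illustration, the de-normalization and volume-wide cropping step can be sketched as follows. The parameter layout and the helper name `crop_sa_volume` are assumptions for illustration; only the predicted ROI centre is used here, since the final crop size is fixed:

```python
import numpy as np

def crop_sa_volume(volume, params, out_size=128):
    """Crop every slice of an SA volume with the bounding box predicted
    on its mid-cavity slice. `params` holds the network outputs,
    normalized to [-1, 1]; this sketch uses only the predicted ROI
    centre (cx, cy), an assumed parameter layout, because the output
    crop size is fixed to out_size x out_size."""
    n_slices, h, w = volume.shape
    cx = (params[0] + 1.0) / 2.0 * w   # de-normalize centre x to pixels
    cy = (params[1] + 1.0) / 2.0 * h   # de-normalize centre y to pixels
    half = out_size // 2
    x0 = int(np.clip(cx - half, 0, w - out_size))
    y0 = int(np.clip(cy - half, 0, h - out_size))
    # the same in-plane box is propagated through the whole 3D stack
    return volume[:, y0:y0 + out_size, x0:x0 + out_size]
```

Clipping the corner coordinates keeps the crop inside the image even when the predicted centre lies close to the border.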
A.2. Appearance Transformations for Targeting Variation in Contrast and Intensity
One of the main challenges of deploying a segmentation algorithm on heterogeneous data is its performance in the presence of extensive contrast and intensity variations. Exploring the provided training and validation sets, we observe that not only does data acquired from different vendors vary in contrast, but the presence of pathology also largely influences tissue visibility and often occludes tissue boundaries. Applying image appearance transformations can improve both contrast and tissue visibility, as well as put more emphasis on tissue shape rather than appearance. To achieve this, we select a set of six transformations per image, where each is fed into a separate encoding path during the training of the late fusion model, namely:
1. Histogram standardization: We standardize image intensities to values representative of each scanner vendor, utilizing the algorithm in [19], which detects landmarks on the image histograms in the training set and averages them to form a standard landmark set per vendor. For a new image, the detected landmarks of its histogram are matched to the previously computed standard positions by linear interpolation of intensities between the landmarks. A similar approach is applied at inference time using landmarks calculated from the training data. Thus, for each image, we generate three counterparts, standardized to the landmarks extracted from GE-, Siemens- and Philips-acquired images.
2. Edge-preserving filtering: To emphasize the shape of the heart cavities and discard high-frequency features, we apply total variation filtering (TVF) to the original input image. TVF is typically used for denoising and produces images with flat domains separated by enhanced edges [16].
3. Solarization and posterization: Solarization can be described as a partial inversion of light and dark intensity values; total solarization yields the negative of the image. Posterization retains the general appearance of the image, but gradual transitions are replaced by abrupt changes in shading from one region to another. This emphasizes edges, flattens the image, and is typically used for contour tracing.
4. Laplacian filter: The Laplacian of an image highlights regions of rapid intensity change and is therefore often used for edge detection.
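Minimal sketches of several of these transformations are given below, assuming images normalized to [0, 1]. The landmark percentiles, solarization threshold and posterization levels are illustrative assumptions, and the TV filter is omitted for brevity:

```python
import numpy as np
from scipy import ndimage

PCTS = (1, 10, 25, 50, 75, 90, 99)  # assumed landmark percentiles

def standardize(img, ref_landmarks):
    """Nyul-style histogram standardization: map this image's
    percentile landmarks onto a vendor-specific reference set by
    piecewise-linear interpolation of intensities."""
    src = np.percentile(img, PCTS)
    return np.interp(img, src, ref_landmarks)

def solarize(img, thresh=0.5):
    """Partially invert intensities: pixels above the threshold
    are flipped, the rest are kept."""
    return np.where(img >= thresh, 1.0 - img, img)

def posterize(img, levels=4):
    """Quantize gradual transitions into a few flat shading levels."""
    return np.floor(img * levels) / levels

def laplacian(img):
    """Highlight regions of rapid intensity change."""
    return ndimage.laplace(img)
```

In practice, one reference landmark set would be computed per vendor from the training data, giving the three standardized counterparts described above.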
Appendix B. Image Synthesis Models
Two identical image synthesis models are trained, one on LA and one on SA cardiac MR images. To augment and balance the data using these trained synthesis models, the following strategies are devised:
i) For each vendor-specific subset, outlier cases are identified based on the end-diastolic or end-systolic RV volume, calculated from the ground-truth labels of the SA images. These outlier cases, separated from the rest of the population, are used for image synthesis by applying random label deformations. To balance the ratio, we apply a different number of deformations per case such that we eventually create 1000 synthesized cases per vendor, comprising 50% outliers and 50% cases from the rest of the population.
ii) Within each subject's SA stack, the number of basal slices is not balanced against the number of mid-ventricular and apical slices: there are typically 2–3 basal slices compared to 6–8 mid-ventricular and apical slices. Basal slices are therefore seen less frequently by the segmentation network during training, which could account for network failures on these challenging slices. To increase their occurrence, we take the labels of the three most basal slices of all cases and randomly deform them 10 times each for image synthesis.
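The volume-based outlier selection in strategy i) can be sketched as follows. The RV label index, the voxel spacing and the IQR-based outlier rule are illustrative assumptions, since the exact criterion is not specified:

```python
import numpy as np

RV_LABEL = 3  # assumed integer label for the RV in the ground truth

def rv_volume_ml(label_vol, spacing=(1.25, 1.25, 10.0)):
    """RV volume (in ml) from a ground-truth SA label stack:
    count RV voxels and multiply by the voxel volume in mm^3."""
    voxel_mm3 = float(np.prod(spacing))
    return np.count_nonzero(label_vol == RV_LABEL) * voxel_mm3 / 1000.0

def find_outliers(volumes, k=1.5):
    """Flag cases outside k * IQR of the vendor-specific RV-volume
    distribution; these are oversampled during synthesis.
    The 1.5 * IQR rule is an assumption for illustration."""
    q1, q3 = np.percentile(volumes, [25, 75])
    lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    return [i for i, v in enumerate(volumes) if v < lo or v > hi]
```

Running `find_outliers` per vendor-specific subset separates the atypical hearts that the synthesis models then oversample via random label deformations.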
Appendix C. Cardiac Cavity Segmentation Architecture and Training Procedure
The late fusion U-Net segmentation model learns a separate convolutional encoder path for each transformed image, whose features are fused at the higher layers (the bottleneck). Here, we assume that the higher-level representations of the different transformations of an image are complementary to each other, while containing distinctive features that aid segmentation. Each encoding path consists of five convolutional blocks with four max-pooling layers. Each convolutional block consists of 3 \(\times \) 3 kernel convolutional layers, batch normalization and leaky ReLU activation. Batch normalization improves regularization and makes the network less susceptible to noise and intensity variation. Moreover, we apply dropout regularization, with a rate of 0.5, after each concatenation operation to further avoid over-fitting.
To increase robustness and cover a wide range of variations in heart pose and size, we additionally augment the training set. Namely, we apply random vertical and horizontal flips (p = 0.5), random rotation by integer multiples of \(\frac{\pi }{2}\) (p = 0.5), random scaling with a scale factor s \(\in \) [0.8, 1.2] (p = 0.2), random translations (p = 0.3) and mirroring (p = 0.5). All augmentations are applied on the fly during training. At inference time, besides normalization and in-plane re-sampling, we apply the set of six transformations to generate the six inputs to the model. After pre-processing, each encoding path is fed with batches of 144 images of size 128 \(\times \) 128 for training the SA segmentation model and batches of 64 images of size 256 \(\times \) 256 for the LA model. We use a validation set to track training progress and identify overfitting, where the same augmentation approach is applied to the validation set and the mean Dice score is calculated each epoch. To train the network, we use a weighted sum of the categorical cross-entropy and Dice loss. We use Adam for optimization, with an initial learning rate of 10\(^{-4}\) and a weight decay of 3 \(\cdot \) 10\(^{-5}\). During training, the learning rate is reduced by a factor of 5 if the validation loss does not improve by at least 5 \(\cdot \) 10\(^{-3}\) for 50 epochs. We apply early stopping on the validation set to avoid overfitting and select the model with the highest accuracy. We train each model (LA and SA) using five-fold cross-validation on the training cases and use the resulting models as an ensemble to predict on the validation or testing set. The training of all models runs for 1000 epochs.
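The combined segmentation loss can be sketched in plain NumPy as below, operating on softmax probabilities and one-hot labels of shape (N, C, H, W). The equal 0.5/0.5 weighting of the two terms is an assumption, since the exact weights are not specified:

```python
import numpy as np

def soft_dice_loss(probs, onehot, eps=1e-6):
    """1 - mean soft Dice over classes; inputs shaped (N, C, H, W)."""
    inter = np.sum(probs * onehot, axis=(0, 2, 3))
    sums = np.sum(probs, axis=(0, 2, 3)) + np.sum(onehot, axis=(0, 2, 3))
    return 1.0 - np.mean((2.0 * inter + eps) / (sums + eps))

def cross_entropy(probs, onehot, eps=1e-12):
    """Categorical cross-entropy averaged over pixels."""
    return -np.mean(np.sum(onehot * np.log(probs + eps), axis=1))

def combined_loss(probs, onehot, w_ce=0.5, w_dice=0.5):
    """Weighted sum of cross-entropy and Dice loss; the 0.5/0.5
    weighting is an illustrative assumption."""
    return w_ce * cross_entropy(probs, onehot) + w_dice * soft_dice_loss(probs, onehot)
```

A perfect prediction drives both terms to zero, while the Dice term keeps gradients informative for small structures (such as the RV on basal slices) that contribute little to the pixel-wise cross-entropy.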
Copyright information
© 2022 Springer Nature Switzerland AG
Cite this paper
Al Khalil, Y., Amirrajab, S., Pluim, J., Breeuwer, M. (2022). Late Fusion U-Net with GAN-Based Augmentation for Generalizable Cardiac MRI Segmentation. In: Puyol Antón, E., et al. Statistical Atlases and Computational Models of the Heart. Multi-Disease, Multi-View, and Multi-Center Right Ventricular Segmentation in Cardiac MRI Challenge. STACOM 2021. Lecture Notes in Computer Science(), vol 13131. Springer, Cham. https://doi.org/10.1007/978-3-030-93722-5_39
Print ISBN: 978-3-030-93721-8
Online ISBN: 978-3-030-93722-5