Abstract
We present a methodology for reconstructing full-cycle respiratory and cardiac gated short-axis cine MR sequences from real-time MR data. For patients who are too ill or otherwise incapable of consistent breath holds, real-time MR sequences are the preferred means of acquiring cardiac images, but suffer from inferior image quality compared to standard short-axis sequences and lack cardiac ECG gating. To construct a sequence from real-time images which, as close as possible, replicates the characteristics of short-axis series, the phase of the cardiac cycle must be estimated for each image and the left ventricle identified, to be used as a landmark for slice re-alignment. Our method employs CNN-based deep learning to segment the left ventricle in the real-time sequence, which is then used to estimate the pool volume and thus the position of each image in the cardiac cycle. We then use manifold learning to account for the respiratory cycle so as to select images of the best quality at expiration. From these images a selection is made to automatically reconstruct a single cardiac cycle, and the images and segmentations are then aligned. The aligned pool segmentations can then be used to calculate volume over time and thus volume-based biomarkers.
You have full access to this open access chapter, Download conference paper PDF
1 Introduction
Short axis cine (SAX) is the primary cardiac magnetic resonance imaging (cMRI) protocol for assessing cardiac function, but relies on electrocardiogram (ECG) gating and breath holds during acquisition. The protocol requires the subject to lie still, restricting it to acquisitions at rest, and to perform breath holds, which not all patients can do due to pathology or age. Free breathing non-gated real-time cine (RT) is an alternative protocol which has recently been proposed as a solution to the above problem [10, 11]. This method acquires images over multiple cardiac cycles at the same locations in the left ventricle (LV) as SAX, both eliminating the need for patient cooperation during breath holds and allowing imaging during movement and distorted ECG signals, eg. during exercise. A less demanding scanning sequence such as RT greatly expands the number of patients who can be scanned, and using RT to assess cardiac function during exercise has its own diagnostic potential [3]. Developing a methodology for improving the quality of RT imaging would therefore provide a new and effective tool for clinicians.
However the RT series suffers from poor image quality, in terms of greater noise and artefacts, compared to SAX sequences, as well as patient motion from respiration and minor body shifts. The images of this series are thus neither aligned in space nor in time as each plane starts acquisition at arbitrary points in the cardiac cycle. This significantly hinders clinical applicability, as through-plane motion of the heart results in significant variability in quantified volumes. In order to approach clinically acceptable standards, these issues need to be compensated for, and to calculate biomarkers with comparable accuracy to conventional cine, a motion corrected ECG gated cine image series should be reconstructed. Specifically images must be retained at end-expiration only to eliminate geometric and position variation, and these must be aligned so that the LV region in each image is correctly aligned in space.
In this paper we propose a workflow for accomplishing this reconstruction using convolutional neural networks (CNN) for automatic segmentation of the LV, manifold learning for estimating respiratory cycles, and an image processing pipeline which uses the information from these two sources to create a single-cycle image series. Our workflow uses automatically generated segmentations alone for determining cardiac cycle position for images. This contrasts with approaches in [1, 13] which estimate the R-R interval using the image frequency domain and then determine which images are end-diastole based on semi-automated segmentation, and other techniques including [5] which use stable ECG signals for reconstruction. Furthermore our method does not require complex k-space reconstruction techniques [4] which are not widely available in clinical practice. Analysis techniques used to assess SAX series can then be applied to this reconstructed series to calculate biomarkers.
Our novel contribution is a workflow which wholly automates the process of reconstructing a single-cycle image series from the input RT series, eliminating both the need for manual segmentation and inter-observer variability. This contrasts with previous work which involved manual input in the reconstruction pipeline [13]. The next sections will describe the steps in the workflow in detail, present results of applying the workflow to volunteer images, and then discuss the further work necessary to refine the technique.
2 Methodology
The automated workflow shown in Fig. 1 transforms a RT series spanning multiple cardiac cycles, containing images neither aligned in space nor corrected for respiration, into a single-cycle series corrected for spatial alignment and containing only end-expiration images. The input image series is a 4-dimensional image volume storing 3D image volumes capturing cardiac geometry over time (3D+t). The output image is also 4D, but contains only a single cardiac cycle, removing the respiratory variability and correctly aligning the sequence to reconstruct an image similar to the standard SAX sequence. The steps in this workflow are outlined here in brief and expanded in the following subsections:
-
1.
Data Import: Image data is imported from the Dicom files and converted to Nifti format using Eidolon [8]. This is the most convenient and reliable tool for this step although any Nifti file from other sources is acceptable.
-
2.
Estimate Respiration: Manifold learning is used to estimate the position of each image in the respiratory cycle, and images not at end-expiration are rejected.
-
3.
Segment Images: A neural network trained to segment the myocardium of the left ventricle is applied to each image, providing a myocardial mask.
-
4.
Select Segments: Each mask is assessed for correctness and quality to remove malformed segmentations outside the LV region.
-
5.
Assign Bins: All images not rejected in the previous steps are assigned to a position in the cardiac cycle. Firstly, images at end-diastole are identified by measuring the pool volumes of their myocardial masks. Other images are then assigned to bins representing phases of the cardiac cycle using the time differences between the identified end-diastole images at the start and end of that cardiac cycle.
-
6.
Reconstruct Series: A reconstructed series is created by selecting an image from each cycle position for every slice position.
-
7.
Calculate Volumes: The reconstruction is also applied to the myocardial mask associated with each image to produce a single-cycle segment series which can be used to calculate volume properties.
2.1 Deep Learning Segmentation
Our technique relies on a convolutional neural network trained to segment LV myocardial tissue in real-time images. The network architecture (Fig. 2) is based on U-Net [12, 16] using residual units [7] and is implemented in Tensorflow. Each numbered section of the encoding phase of the network is implemented using the residual unit definition on the left, where the number indicates the output number of filters. Upsampling in the decoding phase is done with a transpose convolution followed by a single set of batchnorm-prelu-conv2D operations.
Residual units use Parametric Rectified Linear Unit (PReLU) [6] for activation and are defined with 4 series of batchnorm-prelu-convolution operations. Downsampling in the encoding phase of the network is done using striding on the first convolution of the residual unit.
The training data for the network consists of RT series images from 10 volunteers segmented manually by a clinician resulting in a dataset of 3153 greyscale image/mask pairs. Figure 3 shows a selection of augmented image/mask pairs taken from the training dataset. During training, data augmentation [9, 15] is applied to the images to ensure the model does not overfit to the dataset. Specifically, a random selection of rotation, transposition, flip, shift, and zoom operations are applied to each image/mask pair of a batch. Segmenting RT imaging is particularly challenging as RT image quality is inferior to conventional gated cine. Our proposed approach and the network architecture have been specifically chosen and optimised for this task.
Figure 4 illustrates a set of segmentations for a real-time image series after malformed segmentations are removed in the selection stage. The selection process is necessary since every image in the RT series is used for inference with the segmentation network. This includes images above and below the LV which are not meant to be segmented and so are not represented in the training set. When the network attempts to segment these the results are not well-formed circular segmentations, and must be identified and filtered out. Secondly, during the respiration cycle the LV may deform or move through the image plane sufficiently to distort the image and produce poor segmentations. The expectation is that the RT image will capture enough cardiac cycles that there will be a well-segmented representative somewhere to reconstruct a single cycle.
2.2 Manifold Learning
To account for respiratory motion we select the frames at end-expiration, which is the current standard for assessment for cardiac volumes. Here we use Laplacian Eigenmaps (LE) [2] to automatically estimate the respiratory state in each slice in the image stack. LE is a manifold learning technique which embeds the data into a low-dimensional space while preserving local neighbourhood relations by finding the most significant modes of variation along the manifold on which the original data lies. When applied to a sequence of cine cardiac images, we find that this variation consistently corresponds to the respiratory cycle.
Using each image’s coordinates in the low-dimensional embedding we assign a true or false value stating whether each image is within some specified margin of the end-respiration state. MR images, rather than myocardial masks, are used at this step as regions outside the heart also contain information about the respiratory state and help in this categorisation.
2.3 Image Processing
In addition to Eidolon and Tensorflow, our workflow uses Python with the Numpy, Scipy, and Nibabel libraries for array types, scientific functions, and Nifti loading respectively. Our workflow is implemented as a Jupyter notebook which guides the user through the process of applying the operations. This is available as open source on our repository at https://github.com/ericspod/RealtimeReconWorkflow.
Select Segments. In this stage each 2D image is analysed individually to filter out bad segmentations. Due to image quality and subject motion not all images present usable cardiac geometry and thus cannot be correctly segmented. A segmentation is rejected if it does not have exactly one cavity (i.e. the LV chamber), if it contains fewer than 100 pixels, or if the surface area of a convex hull enclosing the segmentation is at least 15% larger than the area of the segmentation including pool. This ensures that only a segmentation representing a relatively smooth annulus is accepted. The value of 100 pixels for minimal size has been found in our analysis to be a rational rejection criteria for images of 256\(\,\times \,\)256 pixels, and similarly the 15% size criteria has been found to reject highly irregular segmentations.
Assign Bins. At this stage images at end-expiration have been selected using manifold learning, these are then assigned to a position in the cardiac cycle based on the segmentation areas. Each image with a segmentation is assigned the surface area of their segmentation’s pools. Images with areas larger than its non-zero neighbours are assumed to be as ED, and so an identifiable cardiac cycle is a list of contiguous images in time, which all have surface areas, from one of these maximal images up to the but not including the next. These individual cycles must be a certain length to be accepted, if so each image in the cycle is assigned a value representing their position in the cycle from 0 to 1.
The cycles thus defined will vary in length and so the values assigned to each image do not all fall easily into obvious bins. A histogram is computed for the percentages to determine what the bins should be such that no bin is empty, which then allows a bin number to be assigned to each image. Each bin represents a frame of a single cardiac cycle, where bin 0 is the ED frame. Bins contain many images, however with the filtering done for segmentation quality and respiration they should all represent the same geometry at the same moment in the cardiac cycle.
Reconstruct Series. With bins assigned to each image (excepting those rejected for poor segmentation or respiratory motion), reconstructing a short-axis like image becomes a process of selecting an image from each bin for each image plane. The number of bins has been chosen to ensure that each bin is represented by at least one image from each image plane, however typically there will be many images to choose from at each plane per bin. Our current strategy is to select, for particular plane and bin, the image whose segmentation’s pool has a surface area closest to the mean of all those in its bin. The time offset between frames is taken from the original images and if this value correctly represents the per-image time since ED this will be preserved in the final output image.
Having selected the images for each slice at each timestep, the same process is applied to the segmentations to produce a second segmentation series. The average of the image centroids from this series is calculated, then each image in the segment series and its corresponding image in the reconstructed SAX-like image are shifted in the XY plane to be centred at this position.
This produces an image and a segmentation series which accounts for patient movement during scanning which is not respiratory in origin, although aligning all images does produce non-physiological results since the left ventricle is not entirely conical but flatter on the septal side. For the purposes of various 2D assessment criteria, for example estimating ejection fraction, this does not impact results but does aid in observing wall motion abnormalities.
Calculate Volumes. Pool volumes are calculated by summing up the number of pixels in the pools of the segmentations. Volumes are then summed in the depth dimension then multiplied by the voxel volume of the original image to give a total per timestep volume.
3 Results
We scanned seven participants using the real-time protocol twice with a rest interval between scans at St. Thomas’ Hospital, London, UK, using a 1.5T Philips Ingenia MR scanner (Philips, Best, Netherlands). All participants were healthy volunteers with ages ranging from 20 to 32, 4 males and 3 females. Due to the length of the scan sequence a much larger dataset of 100 frames was acquired for each subject. These frames are significantly dissimilar even within one patient due to breathing and exercise motion. Therefore these datasets represent a larger training dataset than the number of participants would seem to imply. We did not consider clinical patient participants for this study since their morphological details will not vary significantly from healthy volunteers as observed in the used RT sequences.
Figure 5 shows an example of reconstructed CINE-like sequence and segmentations at end-diastole for all the slices. We can see that the slices are correctly aligned and segmentations allow for accurate assessment of volumes.
For a further validation, we compared the volumes obtained with the proposed reconstruction pipeline cine with a previously validated, non-gated real-time imaging protocol (non-gated RT) that uses manual selection of respiratory state [10]. Table 1 summarises the ED volume (EDV) and end-systolic volume (ESV) for the two methods.
The results obtained with the proposed automatic pipeline correlated well with the manual results from the non-gated MRI sequence. Visual inspection of this sort is standard clinical practice and thus our results are comparable to assessments considered sufficiently accurate for diagnostic purposes. From a clinical perspective, the differences between results are within expected ranges between inter-observer measurements in the reference standard [10, 13], and so are diagnostically valid.
4 Conclusion and Future Work
We have in this paper demonstrated a workflow for reconstructing a single-cycle image series from a real-time series. This is in multiple ways an extension of previous work on reconstruction of respiratory and cardiac gated cine SAX [13], whose method relied on manual input to determine the location of the heart. Furthermore, the previous method for estimating temporal and spatial motions of the images were obtained from changes in image contrast, resulting in sub-optimal alignment and blurring of the images due to inaccurate motion estimation.
Other pipelines have been proposed for cine reconstruction from ungated, free-breathing MR. However, these methods often rely on complex, self-tailored k-space acquisition and reconstruction schemes [4]. Whereas these techniques benefit from utilising raw data instead of reconstructed RT images for reconstruction of gated cine MR, such frameworks have limited clinical applicability, as they are not available in most clinical MR facilities. Our technique is available as standalone Python software or integrated into Eidolon, a free cardiac MR analysis software, and can therefore be implemented using a standard personal computer.
Due to the high acceleration factors needed for RT imaging, image quality is lower compared to conventional cine. Although this did not result in a significant impact on measurements of the most frequently obtained parameters (cardiac volumes and function), specific details and features of boundaries are less distinct and can possibly impact more comprehensive cardiac assessment. Our future research goals include investigating the use of autoencoders [14] as a method of improving image quality.
Our process of assigning images to cardiac phase bins relies the assignment of an ED image using the local maximal pool volume. This is not strictly true and so there is some approximation in how images are assigned to cardiac cycle position. This does not optimally utilise the RT information, as the independency of acquisition and cardiac frequency likely results in different offsets of images from ‘true’ ED between the different cycles acquired per slice. Exploitation of this knowledge will make it possible to create smaller (and thus more) bins and therefore increase temporal resolution of the reconstructed images.
With this work, we have shown a feasible technique for a complete automatic pipeline for the reconstruction of cardiac and respiratory gated cine MR from RT imaging and subsequent quantification of key parameters of cardiac function. When extended to include the full heart, including the basal part of the ventricle and the atria, this technique can be of great benefit in imaging of subjects that are unable to hold their breath or lie still, such as children or severely ill patients. Furthermore, as demonstrated our work allows imaging during diagnostic scans employing physical exercise.
References
van Amerom, J., et al.: Fetal cardiac cine imaging using highly accelerated dynamic MRI with retrospective motion correction and outlier rejection. Magn. Reson. Med. 79, 327–338 (2017)
Belkin, M., Niyogi, P.: Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput. 15(6), 1373–1396 (2003)
Cahalin, L.P., et al.: A meta-analysis of the prognostic significance of cardiopulmonary exercise testing in patients with heart failure. Heart Fail. Rev. 18(1), 79–94 (2013)
Feng, L., Axel, L., Chandarana, H., Block, K.T., Sodickson, D.K., Otazo, R.: XD-GRASP: Golden-angle radial MRI with reconstruction of extra motion-state dimensions using compressed sensing. Magn. Reson. Med. 75(2), 775–788 (2016)
Hansen, M., Sørensen, T., Arai, A., Kellman, P.: Retrospective reconstruction of high temporal resolution cine images from real-time MRI using iterative motion correction. Magn. Reson. Med. 68(3), 741–750 (2012)
He, K., Zhang, X., Ren, S., Sun, J.: Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. CoRR abs/1502.01852 (2015)
He, K., Zhang, X., Ren, S., Sun, J.: Identity mappings in deep residual networks. CoRR abs/1603.05027 (2016)
Kerfoot, E., et al.: Eidolon: visualization and computational framework for multi-modal biomedical data analysis. In: Zheng, G., Liao, H., Jannin, P., Cattin, P., Lee, S.-L. (eds.) MIAR 2016. LNCS, vol. 9805, pp. 425–437. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-43775-0_39
Krizhevsky, A., Sutskever, I., Hinton, G.E.: Imagenet classification with deep convolutional neural networks. In: Proceedings of the 25th International Conference on Neural Information Processing Systems, vol. 1, pp. 1097–1105. NIPS 2012, Curran Associates Inc., USA (2012)
La Gerche, A., et al.: Cardiac MRI: a new gold standard for ventricular volume quantification during high-intensity exercise. Circ. Cardiovasc. Imaging 6(2), 329–338 (2013)
Lurz, P., et al.: Feasibility and reproducibility of biventricular volumetric assessment of cardiac function during exercise using real-time radial k-t SENSE magnetic resonance imaging. J. Magn. Reson. Imaging 29(5), 1062–1070 (2009)
Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. CoRR abs/1505.04597 (2015)
Ruijsink, B.: Semi-automatic cardiac and respiratory gated MRI for cardiac assessment during exercise. In: Cardoso, M.J. (ed.) CMMI/SWITCH/RAMBO - 2017. LNCS, vol. 10555, pp. 86–95. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-67564-0_9
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Parallel distributed processing: Explorations in the microstructure of cognition, vol. 1. chap. Learning Internal Representations by Error Propagation, pp. 318–362. MIT Press, Cambridge (1986)
Simard, P.Y., Steinkraus, D., Platt, J.: Best practices for convolutional neural networks applied to visual document analysis. Institute of Electrical and Electronics Engineers, Inc. August 2003
Zhang, Z., Liu, Q., Wang, Y.: Road extraction by deep residual u-net. CoRR abs/1711.10684 (2017)
Acknowledgements
This research was partly supported by the National Institute for Health Research (NIHR) Biomedical Research Centre (BRC) at Guy’s and St Thomas’ NHS Foundation Trust. Views expressed are those of the authors and not necessarily of the NHS, the NIHR, or the Dept. of Health.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Kerfoot, E., Puyol Anton, E., Ruijsink, B., Clough, J., King, A.P., Schnabel, J.A. (2018). Automated CNN-Based Reconstruction of Short-Axis Cardiac MR Sequence from Real-Time Image Data. In: Stoyanov, D., et al. Image Analysis for Moving Organ, Breast, and Thoracic Images. RAMBO BIA TIA 2018 2018 2018. Lecture Notes in Computer Science(), vol 11040. Springer, Cham. https://doi.org/10.1007/978-3-030-00946-5_4
Download citation
DOI: https://doi.org/10.1007/978-3-030-00946-5_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00945-8
Online ISBN: 978-3-030-00946-5
eBook Packages: Computer ScienceComputer Science (R0)