1 Introduction

Deep learning (DL)-based accelerated magnetic resonance (MR) image reconstruction is currently an active area of research. Many model architectures have been proposed, such as networks which learn end-to-end transformations [4], networks which “unroll” a traditional optimisation algorithm into a deep network [7, 9], optimisation algorithms incorporating deep priors [11] and, more recently, networks incorporating adversarial losses [12]. Finding the optimal network architecture remains an open problem. Despite these advances in architecture design, however, only marginal progress has been made in understanding the behaviour of reconstruction networks. In [13], the expressivity a network needs to achieve perfect reconstruction is characterised using the connection between convolutional neural networks (CNNs) and convolution framelets. The authors of [6] empirically assessed the generalisability of a variational inference network and found that its performance is sensitive to the signal-to-noise ratio (SNR) of the data. Nevertheless, no theory yet exists that can explain the worst-case behaviour of these networks. In [1], it is shown that methods with adversarial losses can bias the reconstruction, with the risk of hallucination. Even though DL methods have been shown to be effective, a good grasp of how they produce errors is crucial for the reliable deployment of these methods in clinical settings.

While bridging the gap in our theoretical understanding of DL-based reconstruction remains challenging, the Bayesian deep learning literature suggests that the uncertainty associated with network outputs can be modelled directly using practical regularisation techniques [2]: namely, MC-dropout and the heteroscedastic loss [5], which capture model uncertainty and data uncertainty respectively. Although such techniques have been applied to MR image quality transfer/super-resolution (SR) tasks [10], they have yet to be investigated in the general MR image reconstruction setting. In this work, we apply them to two network architectures, UNET [4] and a deep cascade of CNNs (DC-CNN) [9]. We show that the Bayesian DL methods can approximately characterise the confidence associated with the generated reconstructions. However, we also point out that the proposed formulation appears too simplistic to model the “true” uncertainty associated with the MR reconstruction problem in general; more sophisticated approaches may be necessary before such uncertainty maps can be leveraged in practical scenarios.

2 Methods

Problem Formulation: Let \(\mathbf {x}\in \mathbb {C}^n\) be a fully-sampled image and \(\mathbf {y}\in \mathbb {C}^m\) be the undersampled data obtained as \(\mathbf {y}= \mathcal {F}_u \mathbf {x}+ \epsilon \), where \(\mathcal {F}_u\) is an undersampling Fourier operator and \(\epsilon \sim \mathcal {N}(0, \sigma ^2 \mathbf {I})\). The goal is to learn the inversion \(p(\mathbf {x}|\mathbf {y})\), or \(p(\mathbf {x}| \mathbf {x}_u)\) where \(\mathbf {x}_u= \mathcal {F}^H_u \mathbf {y}\) is the zero-filled reconstruction, which is aliased. This is typically approached by a maximum a posteriori (MAP) estimate \(\arg \max _{\mathbf {x}} p(\mathbf {x}|\mathbf {y}) = \arg \min _{\mathbf {x}} - \log p(\mathbf {y}|\mathbf {x}) - \log p(\mathbf {x})\), which can be solved as an equivalent minimisation problem wherein the likelihood and the prior terms correspond to data fidelity and regularisation terms, respectively. For example, compressed sensing can be seen as MAP inference with a sparsity-inducing prior. Many deep learning algorithms can be seen as approximations to such MAP inference [3, 7, 9]: they learn an inversion function \(f^{\mathbf {w}}(\mathbf {x}_u) \approx \mathbf {x}\), where the network parameters \(\mathbf {w}\) are learnt from the dataset \(\mathcal {D} = (\mathbf {Y}, \mathbf {X}) = (\{ \mathbf {y}_1, \dots , \mathbf {y}_{|\mathcal {D}|} \},\{ \mathbf {x}_1, \dots , \mathbf {x}_{|\mathcal {D}|} \})\). The problem with MAP inference is that it provides only a point estimate. In the case of compressed sensing, there is a theoretical framework relating the number of measurements, the sparsity level and the reconstruction error. For deep learning, no such theoretical guarantees exist yet, and it is unknown when a network will fail to reconstruct an image. It is therefore desirable to model the distribution \(p(\mathbf {x}| \mathbf {y})\) instead, which can provide the variance associated with the output.
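As a concrete illustration of this forward model, below is a minimal NumPy sketch of the simulated acquisition and the zero-filled reconstruction for a single 2D image. The function names and the implementation are ours for illustration, not the authors' code.

```python
import numpy as np

def forward_model(x, mask, sigma=0.01):
    """Simulate y = F_u x + eps for a 2D complex image x and a binary
    k-space mask (both of shape (nx, ny))."""
    k = np.fft.fft2(x, norm="ortho")                    # F x
    eps = sigma * (np.random.randn(*k.shape)
                   + 1j * np.random.randn(*k.shape))    # complex Gaussian noise
    return mask * (k + eps)                             # keep sampled locations only

def zero_filled(y, mask):
    """x_u = F_u^H y: inverse FFT of the masked k-space; aliased when undersampled."""
    return np.fft.ifft2(mask * y, norm="ortho")
```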

Bayesian Deep Learning: In the Bayesian formulation, given a new undersampled image \(\mathbf {x}_u\) and a dataset \(\mathcal {D}\), a predictive distribution for the reconstructed image \(\mathbf {x}\) is obtained by \(p(\mathbf {x}| \mathbf {x}_u, \mathcal {D}) = \int p(\mathbf {x}| \mathbf {x}_u, \mathbf {w}) p(\mathbf {w}| \mathcal {D}) \,\mathrm {d}\mathbf {w}\). In practice, the posterior \(p(\mathbf {w}| \mathcal {D})\) is intractable and is often approximated by a distribution \(q(\mathbf {w})\) (variational inference). In addition, the predictive distribution above is typically estimated via Monte Carlo integration unless an analytical solution exists.

Two types of uncertainty can be identified in general. The first is aleatoric (data) uncertainty: irreducible uncertainty observed in the data. For the MR image reconstruction problem, besides measurement noise, there is an inherently high level of ambiguity as to whether a pixel value represents an aliasing pattern, anatomy or texture. Whenever the network encounters unseen pathological examples, the model should exhibit a higher level of uncertainty for such regions in the reconstruction. The second is epistemic (model) uncertainty: given a dataset \(\mathcal {D}\), there are many plausible network parameters \(\mathbf {w}\) that can reconstruct the data well. This uncertainty can be reduced by increasing the size of \(\mathcal {D}\); however, in the medical imaging domain it is often difficult to collect large training datasets, making it all the more important to account for the variability in network output caused by this uncertainty.

The two types of uncertainty can be modelled by incorporating the heteroscedastic loss and MC-dropout, respectively. Here we only summarise the methods; detailed derivations can be found in [2, 5]. Firstly, we set the likelihood to \(p(\mathbf {x}|\mathbf {x}_u, \mathbf {w}) = \mathcal {N}(\mathbf {x}| f^{\mathbf {w}}(\mathbf {x}_u), g^{\mathbf {w}}(\mathbf {x}_u))\), where \(f^{\mathbf {w}}(\mathbf {x}_u)\) models the mean prediction and \(g^{\mathbf {w}}(\mathbf {x}_u)\) accounts for the uncertainty in the input by estimating the covariance of the prediction. For simplicity, the covariance matrix is assumed to be diagonal (i.e. we only model the pixel-wise variance). The two networks are trained by minimising the heteroscedastic loss:

$$\begin{aligned} \mathcal {L}_{\mathrm {Het.}}(\mathbf {w}) = \frac{1}{N|\mathcal {D}|}\sum _{(\mathbf {x}_u, \mathbf {x}) \in \mathcal {D}} \sum _{i=1}^N \frac{1}{2g^{\mathbf {w}}_i(\mathbf {x}_u)} \Vert \mathbf {x}_i - f_i^{\mathbf {w}}(\mathbf {x}_u) \Vert ^2 + \frac{1}{2} \log g_i^{\mathbf {w}} (\mathbf {x}_u), \end{aligned}$$
(1)

i.e. the pixel-wise error is weighted by the predicted inverse pixel variance. Epistemic uncertainty can be modelled using MC-dropout, which simply keeps dropout active on the network's activation maps at test time. The predictive mean is given by \(\mathbb {E} [\mathbf {x}] \approx \frac{1}{T} \sum _{t=1}^T f^{\mathbf {w}_t}(\mathbf {x}_u)\), where \(\mathbf {w}_t\) denotes the network configuration after the \(t\)-th dropout sample has been applied. The predictive variance is given by \(\mathbb {V}[\mathbf {x}] \approx \frac{1}{T} \sum _{t=1}^T g^{{\mathbf {w}}_t}(\mathbf {x}_u) + \frac{1}{T} \sum _{t=1}^T (f^{\mathbf {w}_t}(\mathbf {x}_u))^2 - \big ( \frac{1}{T} \sum _{t=1}^T f^{\mathbf {w}_t}(\mathbf {x}_u) \big )^2\), where the first term corresponds to aleatoric uncertainty and the last two terms to epistemic uncertainty. In addition, the variance of each complex-valued pixel is given by the sum of the variances of its real and imaginary components: \(\mathbb {V}[\mathbf {z}] = \mathbb {V}[\mathfrak {R}(\mathbf {z})] + \mathbb {V}[\mathfrak {I}(\mathbf {z})]\).
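To make Eq. (1) and the MC-dropout estimates above concrete, here is a minimal PyTorch sketch. It assumes a hypothetical two-headed `model` returning \((f, \log g)\); predicting \(\log g\) rather than \(g\) is a standard numerical-stability trick [5], and the remaining names are illustrative.

```python
import torch

def heteroscedastic_loss(x, f_pred, log_g_pred):
    """Eq. (1): squared error weighted by the predicted inverse pixel variance.
    Predicting log g keeps g positive and avoids division by zero."""
    inv_g = torch.exp(-log_g_pred)
    return (0.5 * inv_g * (x - f_pred) ** 2 + 0.5 * log_g_pred).mean()

@torch.no_grad()
def mc_dropout_predict(model, x_u, T=20):
    """Predictive mean and variance with dropout kept active at test time.
    For complex images stored as real/imag channels, sum the channel variances."""
    model.train()                          # keep dropout layers stochastic
    fs, gs = [], []
    for _ in range(T):
        f, log_g = model(x_u)              # assumed two-headed network
        fs.append(f)
        gs.append(torch.exp(log_g))
    fs, gs = torch.stack(fs), torch.stack(gs)
    mean = fs.mean(0)                      # (1/T) sum_t f^{w_t}
    aleatoric = gs.mean(0)                 # (1/T) sum_t g^{w_t}
    epistemic = (fs ** 2).mean(0) - mean ** 2
    return mean, aleatoric + epistemic
```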

Fig. 1. The proposed network architectures. For DC-CNN1, f and g are correlated, whereas for DC-CNN2, f and g are conditionally independent. For UNET, f and g share the same encoder but have two separate decoding paths.

Network Architectures: We consider UNET [4] and DC-CNN [9] as the base architectures for \(f^{\mathbf {w}}(\mathbf {x}_u)\). Note that the design of \(g^{\mathbf {w}}(\mathbf {x}_u)\) is flexible: one can parametrise f and g independently, or consider a single network with multiple heads. In the former case, \(g^{\mathbf {w}}(\mathbf {x}_u)\) models intrinsic data uncertainty [10], whereas in the latter case the uncertainty is correlated with the mean prediction. For DC-CNN, we consider both variants: DC-CNN1 aggregates the penultimate feature maps from each sub-network and feeds them into a 5-layer variance network, whereas DC-CNN2 trains an independent 5-layer variance network directly on the undersampled image. For UNET, \(f^{\mathbf {w}}(\mathbf {x}_u)\) and \(g^{\mathbf {w}}(\mathbf {x}_u)\) share the same encoder but have independent decoders. See Fig. 1 for more details.
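As an illustration of the multi-head variant, below is a minimal PyTorch sketch of a shared encoder with separate heads for the mean f and the log-variance g. The layer counts and channel sizes are placeholders, not the exact UNET configuration used here; complex images are represented as two real channels.

```python
import torch.nn as nn

class TwoHeadedNet(nn.Module):
    """Shared encoder; separate heads predict the mean f and log-variance g."""
    def __init__(self, ch=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, ch, 3, padding=1), nn.ReLU(),
            nn.Dropout(p=0.2),                        # element-wise MC-dropout
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
        )
        self.f_head = nn.Conv2d(ch, 2, 3, padding=1)  # mean (real/imag channels)
        self.g_head = nn.Conv2d(ch, 2, 3, padding=1)  # log-variance per pixel

    def forward(self, x_u):
        h = self.encoder(x_u)
        return self.f_head(h), self.g_head(h)
```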

3 Experiments and Results

Dataset: Two datasets were considered for the experiments. Dataset A consists of 5000 short-axis cardiac cine MR images from the UK Biobank study [8], acquired using a bSSFP sequence with matrix size 208 \(\times \) 187, 50 frames and a voxel resolution of 1.8 \(\times \) 1.8 \(\times \) 10.0 mm\(^3\). Since only the magnitude images were available, we simulated the phase components using slowly varying sinusoidal waves. Dataset B consists of 10 fully sampled short-axis cardiac cine MR scans acquired at St. Thomas Hospital, UK, using a bSSFP sequence with 32 channels, matrix size 192 \(\times \) 190, 30 frames, 320 \(\times \) 320 mm FOV and 10 mm slice thickness. The multi-coil data was recombined into a single complex-valued image using SENSE, which was then treated as the ground-truth image. Both datasets were cropped to 192 \(\times \) 192.

Experiment Setup: In this work, we investigate the following questions: (1) How do the Bayesian networks perform compared to the standard networks? (2) What do the generated uncertainty maps look like for (a) the same dataset and undersampling scheme but different acceleration factors, (b) the same dataset but a different undersampling scheme, and (c) a different dataset? To answer these questions, the following experimental setting was considered. Dataset A was split into 4000 training subjects and 1000 test subjects. For training, we used 1D Cartesian undersampling, where each line was sampled according to a zero-mean Gaussian distribution. The undersampling masks were generated on-the-fly during training, and the acceleration factor was chosen randomly from \(n_{\mathrm {acc}} \in [1, 5]\). Note that the networks perform better when fine-tuned on a single fixed acceleration factor; however, this setup sufficed for this work, as we were only interested in the relative performance of the Bayesian formulations. For testing, we used the 1000 test subjects from Dataset A and all subjects from Dataset B. In addition, three different undersampling patterns were considered: 1D Cartesian, radial and low-resolution undersampling (SR). We used golden-angle sampling for radial undersampling; for SR, the lowest frequencies were acquired until the desired acceleration factor was met. The results were evaluated using peak signal-to-noise ratio (PSNR).
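For concreteness, here is a sketch of how such variable-density 1D Cartesian masks might be generated on-the-fly. The Gaussian line-selection density and the fully-sampled-centre heuristic are our assumptions for illustration, not necessarily the exact scheme used.

```python
import numpy as np

def cartesian_mask(nx=192, ny=192, acc=4.0, centre_lines=8):
    """1D Cartesian mask: choose ~ny/acc phase-encode lines with zero-mean
    Gaussian weighting over line position (fftshift-ed k-space coordinates).
    A few central lines are always kept -- an assumed, common heuristic."""
    n_lines = max(int(ny / acc), centre_lines)
    ky = np.arange(ny) - ny // 2
    p = np.exp(-0.5 * (ky / (ny / 6.0)) ** 2)   # Gaussian density over lines
    p /= p.sum()
    chosen = np.random.choice(ny, size=n_lines, replace=False, p=p)
    mask = np.zeros(ny, dtype=bool)
    mask[chosen] = True
    mask[ny // 2 - centre_lines // 2: ny // 2 + centre_lines // 2] = True
    return np.tile(mask, (nx, 1))               # same lines for every readout row

# Example: a mask for a random acceleration factor in [1, 5], as during training
mask = cartesian_mask(acc=np.random.uniform(1, 5))
```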

Model Parameters: For each network proposed above, we considered the following variants for an ablation study: (1) the plain network, (2) MC-dropout only (+D), (3) heteroscedastic loss only (+H), and (4) both (+D+H). For (1) and (2), the usual mean squared error (MSE) loss was used instead. Element-wise dropout with \(p = 0.2\) was applied to every feature map except the last layers of the UNET and the DC-CNN sub-networks. For training, we used Adam with \(\alpha =10^{-4}, \beta _1=0.9,\beta _2=0.999\), where \(\alpha \) was reduced by a factor of 0.1 every 100 epochs. Each network was trained for 300 epochs, using He initialisation, a weight decay of \(\lambda =10^{-6}\) and a mini-batch size of 8. For data augmentation, affine transformations were applied on-the-fly, with parameters sampled from rotations of up to \(360 ^\circ \), shifts of up to \(\pm 20\) pixels and scaling factors \(s \in [0.9, 1.3]\). For MC-dropout, we used \(T=20\) samples, as we empirically found that the results plateaued rapidly beyond that. We used PyTorch for our implementations.
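These settings map directly onto PyTorch. Below is a minimal training-loop sketch reusing the hypothetical `TwoHeadedNet` and `heteroscedastic_loss` from the earlier sketches; the dummy `loader` stands in for the real on-the-fly augmented, undersampled DataLoader.

```python
import torch

model = TwoHeadedNet()                      # hypothetical network from above
optimiser = torch.optim.Adam(model.parameters(), lr=1e-4,
                             betas=(0.9, 0.999), weight_decay=1e-6)
# alpha reduced by a factor of 0.1 every 100 epochs
scheduler = torch.optim.lr_scheduler.StepLR(optimiser, step_size=100, gamma=0.1)

# Dummy stand-in for the real augmented/undersampled DataLoader
loader = [(torch.randn(8, 2, 192, 192), torch.randn(8, 2, 192, 192))]

for epoch in range(300):
    for x_u, x in loader:                   # (zero-filled input, ground truth)
        optimiser.zero_grad()
        f, log_g = model(x_u)
        loss = heteroscedastic_loss(x, f, log_g)
        loss.backward()
        optimiser.step()
    scheduler.step()
```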

Fig. 2. Quantitative error of all models trained on Dataset A with Cartesian undersampling at acceleration factors up to 5, tested on different combinations of dataset, undersampling scheme and acceleration factor.

Results: In Fig. 2, the quantitative results are summarised for each dataset and undersampling scheme over a range of acceleration factors (AF). For Dataset A with Cartesian undersampling (the closest to the training distribution), the Bayesian methods performed worse than the baseline networks. Interestingly, however, the performance gap narrowed when the experiment was repeated with different undersampling schemes or with Dataset B. For DC-CNN, the plain network achieved the highest PSNR and DC-CNN1+D+H performed worst. This might be because the variance estimates are too noisy during training due to dropout, or because training becomes stuck in a suboptimal local minimum as a result of f and g competing. For UNET, the Bayesian models consistently outperformed the baseline on Dataset B. This suggests that the proposed Bayesian formulation could alleviate overfitting to a specific distribution when the model is over-parametrised.

Fig. 3. Visualisation of the generated uncertainty maps for different undersampling patterns and datasets: (top left) Dataset A, Cartesian, AF = 3; (top right) Dataset A, radial, AF = 8; (bottom left) Dataset A, SR, AF = 3; (bottom right) Dataset B, Cartesian, AF = 3. Note that the error and uncertainty maps are normalised across the figure.

Fig. 4. Visualisation of epistemic and aleatoric uncertainty generated by DC-CNN1 and DC-CNN2, overlaid on the ground-truth image.

The epistemic and aleatoric uncertainty maps generated for UNET+D+H, DC-CNN1+D+H and DC-CNN2+D+H are displayed in Fig. 3. We see that, in terms of scale, there is a rough correlation between the error map and the epistemic uncertainty map. However, when each uncertainty map is inspected in detail, it does not necessarily highlight the regions with the highest error. The aleatoric uncertainty tends to highlight image borders, but these do not consistently correspond to the most aliased regions of the undersampled input. In Fig. 4, we compare the epistemic and aleatoric uncertainty maps generated by DC-CNN1+D+H and DC-CNN2+D+H on Dataset A with Cartesian undersampling for different AFs. For DC-CNN1, the uncertainty level increased with the AF. For DC-CNN2, we observe a higher level of uncertainty across the whole image, with edges highlighted more dominantly. We note that these observations are consistent with the literature [10].

4 Discussion and Conclusion

In this work, we evaluated MC-dropout and the heteroscedastic loss for the MR image reconstruction problem. We observed that the Bayesian methods performed competitively when the data was further from the training distribution, and that the generated epistemic and aleatoric uncertainty maps showed a correlation with the error maps. However, we note that the current form of modelling poses several limitations. Firstly, the characteristics of the aleatoric uncertainty depend heavily on whether f and g are correlated, an architectural decision that must be made by the user based on task-oriented goals. Secondly, there were noticeable fine-scale discrepancies between the generated error and uncertainty maps. For the epistemic uncertainty maps, we speculate that this is because MC-dropout is a simple technique that cannot capture the full model uncertainty of the reconstruction networks. For the aleatoric uncertainty maps, it is presumably because the proposed methods model only pixel-wise uncertainty (i.e. only the diagonal entries of the covariance matrix). We hypothesise that this simplification is a poor approximation for modelling the variance in MR reconstruction, as aliasing caused by random undersampling is distributed across the entire image; better modelling of the covariance matrix is therefore likely to improve the results. Finally, albeit parallelisable, obtaining epistemic uncertainty requires T forward passes, which may be problematic for real-time applications. Nevertheless, we believe that Bayesian deep learning has great scope for improvement and is a crucial step towards better characterisation of deep reconstruction networks.