Abstract
Human anatomy, morphology, and associated diseases can be studied using medical imaging data. However, access to medical imaging data is restricted by governance and privacy concerns, data ownership, and the cost of acquisition, thus limiting our ability to understand the human body. A possible solution to this issue is the creation of a model able to learn and then generate synthetic images of the human body conditioned on specific characteristics of relevance (e.g., age, sex, and disease status). Deep generative models, in the form of neural networks, have been recently used to create synthetic 2D images of natural scenes. Still, the ability to produce high-resolution 3D volumetric imaging data with correct anatomical morphology has been hampered by data scarcity and algorithmic and computational limitations. This work proposes a generative model that can be scaled to produce anatomically correct, high-resolution, and realistic images of the human brain, with the necessary quality to allow further downstream analyses. The ability to generate a potentially unlimited amount of data not only enables large-scale studies of human anatomy and pathology without jeopardizing patient privacy, but also significantly advances research in the field of anomaly detection, modality synthesis, learning under limited data, and fair and ethical AI. Code and trained models are available at: https://github.com/AmigoLab/SynthAnatomy.
G. Novati and M. Vella—Work done while at NVIDIA.
Notes
1. Implementation used: https://github.com/lucidrains/performer-pytorch.
References
Sudlow, C., et al.: UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med. 12(3) (2015)
Jack Jr., C.R., et al.: The Alzheimer’s disease neuroimaging initiative (ADNI): MRI methods. J. Magn. Reson. Imaging 27(4), 685–691 (2008)
Simpson, A.L., et al.: A large annotated medical image dataset for the development and evaluation of segmentation algorithms. arXiv preprint arXiv:1902.09063 (2019)
Chong, C.K., Ho, E.T.W.: Synthesis of 3D MRI brain images with shape and texture generative adversarial deep neural networks. IEEE Access 9, 64747–64760 (2021)
Lin, W., et al.: Bidirectional mapping of brain MRI and PET with 3D reversible GAN for the diagnosis of Alzheimer’s disease. Front. Neurosci. 15, 357 (2021)
Rusak, F., et al.: 3D Brain MRI GAN-based synthesis conditioned on partial volume maps. In: Burgos, N., Svoboda, D., Wolterink, J.M., Zhao, C. (eds.) SASHIMI 2020. LNCS, vol. 12417, pp. 11–20. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-59520-3_2
Segato, A., et al.: Data augmentation of 3D brain environment using deep convolutional refined auto-encoding alpha GAN. IEEE Trans. Med. Robot. Bion. 3(1), 269–272 (2020)
Kwon, G., Han, C., Kim, D.: Generation of 3D brain MRI using auto-encoding generative adversarial networks. In: Shen, D., et al. (eds.) MICCAI 2019. LNCS, vol. 11766, pp. 118–126. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-32248-9_14
Xing, S., et al.: Cycle consistent embedding of 3D brains with auto-encoding generative adversarial networks. In: Medical Imaging with Deep Learning (2021)
Sun, L., et al.: Hierarchical amortized training for memory-efficient high resolution 3D GAN. arXiv preprint arXiv:2008.01910 (2020)
Wang, Z., et al.: Multiscale structural similarity for image quality assessment. In: The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, vol. 2, pp. 1398–1402. IEEE (2003)
Heusel, M., et al.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Gretton, A., et al.: A kernel two-sample test. J. Mach. Learn. Res. 13(1), 723–773 (2012)
Razavi, A., et al.: Generating diverse high-fidelity images with VQ-VAE-2. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Esser, P., et al.: Taming transformers for high-resolution image synthesis. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12873–12883 (2021)
Yu, J., et al.: Vector-quantized image modeling with improved VQGAN. arXiv preprint arXiv:2110.04627 (2021)
Van Den Oord, A., Vinyals, O., et al.: Neural discrete representation learning. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Choromanski, K., et al.: Rethinking attention with performers. In: Proceedings of ICLR (2021)
Jordon, J., et al.: Synthetic data-what, why and how? arXiv preprint arXiv:2205.03257 (2022)
Esteban, C., et al.: Real-valued (medical) time series generation with recurrent conditional GANs. arXiv preprint arXiv:1706.02633 (2017)
Ashburner, J., Friston, K.J.: Voxel-based morphometry-the methods. Neuroimage 11(6), 805–821 (2000)
Cardoso, M.J., et al.: Geodesic information flows: spatially-variant graphs and their application to segmentation and fusion. IEEE Trans. Med. Imaging 34(9), 1976–1988 (2015)
Tay, Y., et al.: Long range arena: a benchmark for efficient transformers. In: International Conference on Learning Representations (2020)
Graham, M.S., et al.: Transformer-based out-of-distribution detection for clinically safe segmentation. In: Conference on Medical Imaging with Deep Learning (2022)
Dhariwal, P., et al.: Jukebox: a generative model for music. arXiv preprint arXiv:2005.00341 (2020)
Zhang, R., et al.: The unreasonable effectiveness of deep features as a perceptual metric. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 586–595 (2018)
Isola, P., et al.: Image-to-image translation with conditional adversarial networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1125–1134 (2017)
Mao, X., et al.: Least squares generative adversarial networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 2794–2802 (2017)
Tudosiu, P.-D., et al.: Neuromorphologicaly-preserving volumetric data encoding using VQ-VAE. arXiv preprint arXiv:2002.05692 (2020)
Gulrajani, I., et al.: Improved training of Wasserstein GANs. In: Conference on Advances in Neural Information Processing Systems, vol. 30 (2017)
Ridgway, G.R., et al.: The problem of low variance voxels in statistical parametric mapping; a new hat avoids a ‘haircut’. Neuroimage 59(3), 2131–2141 (2012)
Pinaya, W.H.L., et al.: Unsupervised brain anomaly detection and segmentation with transformers. In: Conference on Medical Imaging with Deep Learning, pp. 596–617. PMLR (2021)
Bachlechner, T., et al.: ReZero is all you need: fast convergence at large depth. In: Uncertainty in Artificial Intelligence, pp. 1352–1361. PMLR (2021)
Ashburner, J., et al.: SPM12 Manual. Wellcome Trust Centre for Neuroimaging, London (2014)
Acknowledgements
WHLP, MG, PB, MJC and PN are supported by Wellcome [WT213038/Z/18/Z]. PTD is supported by the EPSRC Research Council, part of the EPSRC DTP [EP/R513064/1]. FV is supported by the Wellcome/EPSRC Centre for Medical Engineering [WT203148/Z/16/Z], Wellcome Flagship Programme [WT213038/Z/18/Z], The London AI Centre for Value-based Healthcare and GE Healthcare. PB is also supported by Wellcome Flagship Programme [WT213038/Z/18/Z] and Wellcome EPSRC CME [WT203148/Z/16/Z]. PN is also supported by the UCLH NIHR Biomedical Research Centre. The models in this work were trained on NVIDIA Cambridge-1, the UK’s largest supercomputer, aimed at accelerating digital biology.
6 Appendix
6.1 VQ-VAEs
The VQ-VAE model has an architecture similar to that of [33], but in 3D. The encoder uses strided convolutions with stride 2 and kernel size 4. There are four downsampling layers in this VQ-VAE, giving a downsampling factor of \(f=2^4\). After the downsampling layers, there are three residual blocks (\(3\times 3\times 3\) Conv, ReLU, \(1\times 1\times 1\) Conv, ReLU). The decoder mirrors the encoder, using transposed convolutions with stride 2 and kernel size 4. All convolution layers have 256 kernels. The \(\beta \) in Eq. 1 is 0.25 and the \(\gamma \) in Eq. 2 is 0.5. The codebook size was 2048, and each codebook element had a size of 32.
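As a concrete illustration, a minimal PyTorch sketch of this 3D VQ-VAE follows. The class and variable names are hypothetical, the \(1\times 1\times 1\) projections into and out of the 32-dimensional codebook space are an assumption, and the training losses (Eqs. 1 and 2) are omitted.

```python
import torch
import torch.nn as nn

class ResBlock3D(nn.Module):
    """Residual block as stated above: 3x3x3 Conv, ReLU, 1x1x1 Conv, ReLU."""
    def __init__(self, channels: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(channels, channels, kernel_size=1), nn.ReLU(),
        )

    def forward(self, x):
        return x + self.block(x)

class VQVAE3D(nn.Module):
    """Sketch: four stride-2 downsamplings (f = 2**4), three residual
    blocks, a mirrored transposed-conv decoder, and a 2048 x 32 codebook."""
    def __init__(self, in_ch=1, ch=256, num_codes=2048, code_dim=32):
        super().__init__()
        enc, c = [], in_ch
        for _ in range(4):  # stride-2, kernel-4 downsampling convs
            enc += [nn.Conv3d(c, ch, kernel_size=4, stride=2, padding=1), nn.ReLU()]
            c = ch
        enc += [ResBlock3D(ch) for _ in range(3)]
        enc += [nn.Conv3d(ch, code_dim, kernel_size=1)]  # assumed projection
        self.encoder = nn.Sequential(*enc)
        self.codebook = nn.Embedding(num_codes, code_dim)
        dec = [nn.Conv3d(code_dim, ch, kernel_size=1)]
        dec += [ResBlock3D(ch) for _ in range(3)]
        for _ in range(3):  # stride-2, kernel-4 upsampling transposed convs
            dec += [nn.ConvTranspose3d(ch, ch, kernel_size=4, stride=2, padding=1), nn.ReLU()]
        dec += [nn.ConvTranspose3d(ch, in_ch, kernel_size=4, stride=2, padding=1)]
        self.decoder = nn.Sequential(*dec)

    def quantize(self, z):
        # Nearest-codebook-entry lookup over the flattened latent voxels.
        b, d, *spatial = z.shape
        flat = z.permute(0, 2, 3, 4, 1).reshape(-1, d)
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        zq = self.codebook(idx).view(b, *spatial, d).permute(0, 4, 1, 2, 3)
        # Straight-through estimator so gradients reach the encoder.
        return z + (zq - z).detach(), idx.view(b, *spatial)

    def forward(self, x):
        z = self.encoder(x)
        zq, idx = self.quantize(z)
        return self.decoder(zq), z, zq, idx
```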
6.2 Transformers
The Performer [19] (see footnote 1) has \(L=24\) layers, an embedding size of \(d=256\), and 16 attention heads (8 of which are local attention heads with a window size of 420), with ReZero gating [34]. Before the raster-style ordering, the input volumes were reoriented to the RAS+ canonical voxel representation.
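Under this configuration, instantiating such a model with the performer-pytorch library from footnote 1 might look like the sketch below. The vocabulary size (2048 codebook entries plus a begin-of-sequence token) and the 1400-token sequence length (a \(10\times 14\times 10\) latent grid from \(160\times 224\times 160\) inputs at \(f=16\)) are assumptions for illustration.

```python
import torch
from performer_pytorch import PerformerLM

model = PerformerLM(
    num_tokens=2048 + 1,   # codebook size plus an assumed BOS token
    max_seq_len=1400 + 1,  # assumed 10 x 14 x 10 latent grid, rasterized
    dim=256,               # embedding size d = 256
    depth=24,              # L = 24 layers
    heads=16,              # 16 attention heads in total
    local_attn_heads=8,    # 8 local attention heads ...
    local_window_size=420, # ... with window size 420
    causal=True,           # autoregressive next-index prediction
    use_rezero=True,       # ReZero gating [34]
)

tokens = torch.randint(0, 2049, (1, 1401))  # dummy index sequence
logits = model(tokens)                      # shape (1, 1401, 2049)
```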
6.3 Losses
The VQ-VAE’s pixel-space loss weight is 1.0, the perceptual loss weight is 0.001, and the frequency loss weight is 1.0. The LPIPS uses AlexNet. Adam was used as the optimizer, with an exponential learning-rate decay of 0.99999. The VQ-VAE’s learning rate was 0.000165, the discriminator’s learning rate was 0.00005, and the Performer’s cross-entropy learning rate was 0.001.
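A hedged sketch of how these weights might be combined is shown below; the slice-wise LPIPS helper and the FFT-magnitude frequency loss are illustrative stand-ins for the actual implementation, which is not fully specified here.

```python
import torch
import torch.nn.functional as F
import lpips  # pip install lpips; AlexNet backbone, as stated above

percept = lpips.LPIPS(net='alex')

def slice_lpips(recon, target):
    """Hypothetical helper: LPIPS is 2D, so score the mid-axial slice,
    tiled to 3 channels for AlexNet. Inputs are (B, 1, D, H, W) in [0, 1]."""
    d = recon.shape[2] // 2
    r = recon[:, :, d].repeat(1, 3, 1, 1)
    t = target[:, :, d].repeat(1, 3, 1, 1)
    return percept(r, t, normalize=True).mean()

def frequency_loss(recon, target):
    """Hypothetical stand-in: L1 distance between 3D FFT magnitudes."""
    return F.l1_loss(torch.fft.fftn(recon, dim=(-3, -2, -1)).abs(),
                     torch.fft.fftn(target, dim=(-3, -2, -1)).abs())

def reconstruction_loss(recon, target):
    # Weights as given above: pixel 1.0, perceptual 0.001, frequency 1.0.
    return (1.0 * F.l1_loss(recon, target)
            + 0.001 * slice_lpips(recon, target)
            + 1.0 * frequency_loss(recon, target))

# Usage with the VQVAE3D sketch from Sect. 6.1 (learning rates as stated):
# opt = torch.optim.Adam(vqvae.parameters(), lr=1.65e-4)
# sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.99999)
```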
6.4 Datasets
All datasets were split into training and testing sub-sets. The VQ-VAE UKB sub-sets had 31740 and 3970 subjects respectively, while the VQ-VAE ADNI sub-sets had 648 and 82. All datasets were first processed with a rigid-body registration so that they roughly fit the same field of view. Afterwards, all samples are passed through the following transformations before being fed into the VQ-VAE during training: first, they are normalized to [0, 1] and tightly spatially cropped, resulting in an image of size (160, 224, 160); then a random affine transformation (rotation range 0.04, translation range 2, scale range 0.05), random contrast adjustment (gamma [0.99, 1.01]), random intensity shift (offsets [0.0, 0.05]), and random Gaussian noise (mean 0.0, standard deviation 0.02) are applied; finally, the images are thresholded back to the range [0, 1.0], as in the sketch below. For the Transformer, the UKB and ADNI datasets were split into sub-populations. UKB was split into small ventricles (6388 and 108), big ventricles (6321 and 156), young (6633 and 113), and old (5137 and 106), while ADNI was split into cognitively normal (118 and 29) and Alzheimer’s disease (151 and 36). For the Transformer training, each ADNI sample was augmented 100 times, and the index-based representation of each augmentation was used for training.
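One plausible realization of this augmentation chain, sketched with MONAI transforms, is given below; the per-transform probabilities are assumptions, as they are not stated above.

```python
from monai import transforms

# Sketch of the stated training-time pipeline; prob values are assumptions.
train_tf = transforms.Compose([
    transforms.ScaleIntensity(minv=0.0, maxv=1.0),
    transforms.CenterSpatialCrop(roi_size=(160, 224, 160)),
    transforms.RandAffine(prob=0.5, rotate_range=0.04,
                          translate_range=2, scale_range=0.05),
    transforms.RandAdjustContrast(prob=0.5, gamma=(0.99, 1.01)),
    transforms.RandShiftIntensity(prob=0.5, offsets=(0.0, 0.05)),
    transforms.RandGaussianNoise(prob=0.5, mean=0.0, std=0.02),
    # Clamp back to [0, 1] after the stochastic intensity transforms.
    transforms.ThresholdIntensity(threshold=1.0, above=False, cval=1.0),
    transforms.ThresholdIntensity(threshold=0.0, above=True, cval=0.0),
])
```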
6.5 VBM Analysis
For the Voxel-Based Morphometry (VBM) analysis, the Statistical Parametric Mapping (SPM) [35] package, version 12.7486, was used with MATLAB R2019a. Before running the statistical tests, the images first underwent unified segmentation, in which they were spatially normalized to a common template and simultaneously segmented into Gray Matter (GM), White Matter (WM), and Cerebrospinal Fluid (CSF) tissue compartments based on prior probability maps and voxel intensities. The unified segmentation was done with the default parameters: Bias Regularisation (light regularisation, 0.001), Bias FWHM (60 mm cutoff), MRF Parameter (1), Clean Up (Light Clean), Warping Regularisation ([0, 0.001, 0.5, 0.05, 0.2]), Affine Regularisation (ICBM space template, European brains), Smoothness (0), and Sampling Distance (3). As per standard practice in VBM, the group-aligned segmentations were modulated to preserve tissue volume, and a smoothing kernel was applied to the modulated tissue compartments to make the data conform to the Gaussian field model that underlies VBM and to increase the sensitivity to detect structural changes. The smoothing was also done with the default parameters, with an FWHM of [8, 8, 8]. For the VBM analysis, a two-sample t-test design was used, with the following parameters: Independence (Yes), Variance (Unequal), Grand mean scaling (No), and ANCOVA (No). No covariates, masking, or global normalisation were used.
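For readers without SPM, the core contrast reduces to a voxelwise two-sample test with unequal variances (Welch’s t-test) on the smoothed, modulated tissue maps; a minimal NumPy/SciPy illustration, with hypothetical array names and without SPM’s random-field correction, follows.

```python
import numpy as np
from scipy import stats

def voxelwise_ttest(gm_group_a: np.ndarray, gm_group_b: np.ndarray):
    """Voxelwise Welch two-sample t-test, mirroring the design above
    (independent samples, unequal variance, no covariates or global
    normalisation). Inputs are stacks of smoothed, modulated GM maps
    with shape (subjects, x, y, z); returns t- and p-maps."""
    t_map, p_map = stats.ttest_ind(gm_group_a, gm_group_b,
                                   axis=0, equal_var=False)
    return t_map, p_map
```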
6.6 Additional Samples
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Tudosiu, P.-D., et al. (2022). Morphology-Preserving Autoregressive 3D Generative Modelling of the Brain. In: Zhao, C., Svoboda, D., Wolterink, J.M., Escobar, M. (eds.) Simulation and Synthesis in Medical Imaging. SASHIMI 2022. Lecture Notes in Computer Science, vol. 13570. Springer, Cham. https://doi.org/10.1007/978-3-031-16980-9_7