Style transfer in conditional GANs for cross-modality synthesis of brain magnetic resonance images

https://doi.org/10.1016/j.compbiomed.2022.105928

Highlights

  • Style transfer is introduced into the conditional GAN architecture to address cross-modality MR image synthesis.

  • A conditional GAN model with hierarchical feature mapping and fusion (ST-cGAN) is proposed to obtain style-enhanced synthetic images.

  • Per-pixel random noise is added at different scales of the proposed generator network to train a robust generator that is insensitive to noise.

  • The experimental results confirm the effectiveness of ST-cGAN from different perspectives of image quality assessment.

Abstract

Magnetic resonance imaging (MRI) has become one of the most standardized and widely used neuroimaging protocols in the detection and diagnosis of neurodegenerative diseases. In clinical scenarios, multi-modality MR images can provide more comprehensive information than single-modality images. However, high-quality multi-modality MR images can be difficult to obtain in the actual diagnostic process due to various uncertainties. Efficient methods of modality complementation and synthesis have therefore attracted increasing attention in the research community. In this article, style transfer is introduced into the conditional generative adversarial network (cGAN) architecture. A cGAN model with hierarchical feature mapping and fusion (ST-cGAN) is proposed to address the cross-modality synthesis of MR images. To move beyond the sole focus on pixel-wise similarity that characterizes most cGAN-based methods, the proposed ST-cGAN takes advantage of style information and applies it to the synthetic image's content structure. Taking images of two modalities as conditional input, ST-cGAN extracts style features at different levels and integrates them with the content features to form a style-enhanced synthetic image. Furthermore, the proposed model is made robust to random noise by adding noise input to the generator. A comprehensive analysis is performed by comparing the proposed ST-cGAN with other state-of-the-art baselines based on four representative evaluation metrics. The experimental results on the IXI (Information eXtraction from Images) dataset verify the validity of ST-cGAN from different evaluation perspectives.

Introduction

Magnetic resonance imaging (MRI) provides an intuitive method for studying the structure and function of the human brain, and has become one of the most standardized and widely used neuroimaging methods in the detection and diagnosis of neurodegenerative diseases. It is a non-invasive and radiation-free imaging technique used to generate high-resolution 3D or 4D images of different brain tissues. Different pulse sequences and parameters in the scanning process of imaging equipment can generate images of various tissue contrasts. These multi-modality MR images can display valuable information on tissue structure and function from different aspects. For example, T1-weighted (T1) images, characterized by short repetition time (TR) and short echo time (TE), are better suited for observing anatomical structures and distinguishing between gray matter (GM) and white matter (WM). T2-weighted (T2) images, with long TR and long TE, provide better visualization of tissue lesions. Fluid-attenuated inversion recovery (FLAIR) is a T2-weighted contrast image with an inversion recovery sequence to improve the conspicuity of lesions in WM. The cerebrospinal fluid (CSF) appears black on FLAIR images and white on T2 images [1], [2].

In most clinical scenarios, multi-modality MR images are the preferred choice as they provide more comprehensive information for disease diagnosis than single-modality images [3]. For example, multimodal images are beneficial in unveiling subtle pathologic changes of the brain tissues that are hard to appreciate in single-modality images. However, different medical institutions are limited to their respective scanning equipment and imaging protocols [4], which may cause uncertainties in collecting paired multi-modality MR images. Additionally, some modalities of MR images become unusable during data acquisition and storage due to artifacts, improper scanning parameters or the loss of some sequences [5], [6]. All of these conditions complicate the application of multimodal MR images in clinical diagnosis, creating uncertainties in fully exploiting their true efficacy. Moreover, rescanning the same subject to obtain the missing or unavailable modalities would be highly impractical: apart from the high cost, the abnormalities detected in the subjects' brains change over time, making the new data no longer match the original data. Therefore, cross-modality synthesis of MR images has been pursued to address modality absence and inconsistency.

Image synthesis can be summarized as a process of generating new images similar to the original data by learning the image features of the original data domain. Since image synthesis can serve as an effective method for data augmentation and as a preprocessing step for various downstream image processing tasks (e.g., segmentation and classification), it has recently attracted considerable attention, and research on medical image synthesis has grown accordingly. We review this work in two categories of medical image synthesis: unconditional synthesis and cross-modality synthesis (a type of conditional synthesis).

(1) Unconditional image synthesis

Unconditional synthesis aims to learn the data distribution of the original images and generate new images satisfying that distribution without any other conditional item [7]. Among the various image synthesis methods, algorithms based on generative deep learning have made breakthroughs in different applications. As one of the most representative approaches, generative adversarial networks (GANs) [8] broaden the boundaries of traditional patterns in medical imaging because of their ability to generate high-quality and realistic images [7], [9]. Calimeri et al. [10] used Laplacian generative adversarial networks (LAPGAN) [11] to progressively generate brain MR images from coarse features to fine features. The evaluation results produced by quantitative metrics and experts' manual inspection showed its effectiveness in generating realistic brain MR images. The clinical demand for medical image resolution has encouraged researchers to try more GAN frameworks capable of generating high-resolution images. Beers et al. [12] introduced progressively grown GANs (PGGAN) [13] into the synthesis of multi-modal MR images of gliomas as well as fundus photographs of vascular lesions, gradually generating images from a low resolution to the desired resolution.

In addition to GANs, a variety of deep generative models have emerged in the field of unconditional image synthesis, and some scholars have applied multiple generative models to the synthesis of MR images to obtain a comprehensive comparison. Zhuang et al. [14] compared Gaussian mixture models (GMMs) [15], variational auto-encoders (VAEs) [16] and GANs on data augmentation of functional MRI (fMRI). They found that the improved Wasserstein GAN [17] framework and VAEs with conditional variants could generate high-quality, diverse and task-dependent brain images. Kwon et al. [18] leveraged VAEs and GANs to build a framework for normal and pathological brain MR image synthesis named auto-encoding GAN [19], which adds a code discriminator to the network structure. Their hybrid model succeeded in alleviating image blurriness and mode collapse when generating MR images. Unconditional image synthesis techniques are able to synthesize realistic and diverse images, and are increasingly used as a means of data augmentation. However, because these models have no conditional terms when synthesizing images, they are ill-suited to tasks that require targeted modality synthesis. In this case, cross-modality synthesis with conditional constraints is needed to achieve one-to-one correspondence between images of different modalities.

(2) Cross-modality image synthesis

Cross-modality image synthesis, also known as image modality translation, enables the conversion of one possible representation of the image content into another given enough training data, which is essentially a pixel-to-pixel mapping problem [7]. Machine learning approaches were quickly introduced into cross-modality image synthesis. Jog et al. [20] adopted random forest regression to predict the intensities of brain tissue contrasts given an input image. This approach was able to synthesize both T2-weighted and FLAIR images with fast computation. Chartsias et al. [6] proposed a fully convolutional neural network model based on a modality-invariant latent representation to synthesize multi-modality MR images from multi-modality input. It embedded the input modalities into a shared latent space and transformed the fused representation into the target modality through a decoder.

Unsurprisingly, GAN-based methods are also widely studied in cross-modality image synthesis. Among this kind of research, the most prevalent methods are based on conditional GANs [21], which learn a representation from conditional input to target output. Dar et al. [22] utilized conditional GANs to conduct multi-contrast image synthesis, that is, mutual translation between T1-weighted and T2-weighted images. They further improved the synthesis quality by adding neighboring cross-section images to the model. Yu et al. [23] focused on the influence of image texture details on the content structure of the synthesized image, and proposed an edge-aware GAN that introduces an edge detector for multi-modality brain MR image synthesis. They subsequently proposed a sample-adaptive GAN to enhance local spatial learning for individual samples [24]: the model learns along two paths, one capturing the global spatial mapping of every sample, and the other mapping neighboring samples conditioned on the individual sample and fusing the target-modality feature information, so that the model can be flexibly adjusted for cross-modality synthesis. Sharma and Hamarneh [25] implemented a multi-modal GAN to supplement missing MRI pulse sequences. The multi-input multi-output (MIMO) model was able to synthesize missing pulse sequences from any combination of available pulse sequences.

Since cross-modality image synthesis is a pixel-to-pixel mapping problem [7], most conditional GAN-based methods focus on the one-to-one correspondence of pixels between the synthetic image and the reference modality image when designing the network. Among them, pix2pix [26] aims to maximize the pixel-wise intensity similarity between the synthetic image and the reference image, which requires two matched modality images as input during model training. Obtaining sufficient paired images of two modalities for model training can be quite challenging in practice, and overemphasizing pixel-wise similarity may neglect information such as shape, texture, visual patterns and other style features. Based on the concept of image style, we introduce style transfer [27] into the conditional GAN-based cross-modality image synthesis framework. Although some studies have been carried out on style-based image translation [28], [29], [30], [31], the integration of style transfer and conditional GANs, taking advantage of their respective strengths, is a new and promising attempt in cross-modality synthesis of MR images. In the proposed hierarchical style transfer conditional GAN model, the stylistic similarity between the synthetic image and the target modality is enhanced through the fusion of image content and style features in different layers of the network. The integrated quality of the synthetic image produced by the proposed method is further improved by combining pixel-level and style-level similarities, which aligns better with human visual perception.

The main contributions of this work include the following:

(1) Style transfer is introduced into the conditional GAN architecture and a generative model with hierarchical feature mapping and fusion (ST-cGAN) is proposed. The proposed model receives two modalities as conditional input and extracts content and style features in different layers of the network. It applies style transfer and feature fusion to the hybrid features to obtain a style-enhanced synthetic image, which makes the synthetic image closely resemble the target modality image from a stylistic point of view and effectively improves image quality (a minimal illustration of this style-fusion idea is sketched after this list).

(2) Since noise and other artifacts may greatly affect the readability of MR images, it is essential to improve the robustness of the model to noise. The proposed model accounts for the effect of noise on image quality by adding random disturbance at different scales of the generator network during image synthesis. The noise input turns out to be helpful for training a robust generator that is insensitive to noise (the sketch after this list also illustrates this noise injection).

(3) Image quality assessment is a complicated problem and different evaluation metrics may yield discrepant results. To provide a comprehensive and reasonable comparison between the proposed method and baseline methods, evaluation metrics of four representative dimensions (i.e., pixel-based, structure-based, feature-based and distribution-based) are utilized to assess the synthetic image. The comparison results verify the validity of the ST-cGAN method, while these metrics help reveal the strengths and weaknesses of each method from different perspectives and provide ideas for targeted improvements.
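To make contributions (1) and (2) concrete, the following is a minimal PyTorch sketch of the two generator-side ideas: adaptive instance normalization (AdaIN, the style-transfer operation named in the conclusion) to impose target-modality style statistics on content features, and learned-scale per-pixel noise injection at a single generator scale. All module and parameter names here are illustrative assumptions; the sketch does not reproduce the paper's actual hierarchical architecture.

```python
# Minimal sketch (not the authors' code): AdaIN-based style fusion with
# per-pixel noise injection, the two ideas in contributions (1) and (2).
import torch
import torch.nn as nn


def adain(content, style, eps=1e-5):
    """Align the channel-wise mean/std of content features to style features."""
    # content, style: (N, C, H, W) feature maps from some encoder layer
    c_mean = content.mean(dim=(2, 3), keepdim=True)
    c_std = content.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style.mean(dim=(2, 3), keepdim=True)
    s_std = style.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content - c_mean) / c_std + s_mean


class NoisyStyleBlock(nn.Module):
    """One hypothetical generator scale: AdaIN fusion plus noise injection."""

    def __init__(self, channels):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        # one learnable noise weight per channel, initialized to zero
        self.noise_weight = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.act = nn.LeakyReLU(0.2)

    def forward(self, content_feat, style_feat):
        x = adain(content_feat, style_feat)   # style-enhance the content features
        noise = torch.randn_like(x[:, :1])    # per-pixel noise, shared across channels
        x = x + self.noise_weight * noise     # scale-specific random disturbance
        return self.act(self.conv(x))
```

In ST-cGAN, style fusion and noise injection are reported at several layers of the generator (hierarchical feature mapping and fusion); the block above shows a single scale only.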

The rest of this article is organized as follows: an elaborate description of the methodology and our proposed model is provided in Section 2. Section 3 introduces our experimental design and its implementation. The experimental results and detailed analysis are presented in Section 4. A brief conclusion along with prospects for future work is given in Section 5.

Section snippets

Proposed method

In the original GANs proposed by Goodfellow et al. [8], the generator network G is trained to learn a transformation from random noise z in a prior distribution to the target data distribution. Meanwhile, the discriminator network D is trained to discriminate the generated samples from the real ones. Conditional GANs (cGAN) [21] learn a mapping from conditioned input to target output, which differs from generating data from random noise. Image-to-image translation [32], which generates target images conditioned on given input images, follows this paradigm.
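For reference, the conditional GAN objective that this paragraph summarizes can be written in its standard form [21] (reproduced from the literature, not from this paper's snippet):

$$\min_G \max_D \; \mathbb{E}_{x,y}\big[\log D(x,y)\big] + \mathbb{E}_{x,z}\big[\log\big(1 - D(x, G(x,z))\big)\big]$$

where $x$ is the conditional input image, $y$ the real target modality image, and $z$ the random noise input to the generator.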

Experiments and implementation

This section introduces the dataset used in this study and the experimental settings. We also provide a description of the comparison methods and evaluation metrics. The experiments were run on a 3.50 GHz CPU with 192 GB RAM and an NVIDIA Quadro P4000 GPU with 8 GB of memory.
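The exact metrics are not named in this snippet, only their four dimensions (pixel-, structure-, feature- and distribution-based). As a hedged illustration, common instantiations of the first two dimensions are PSNR and SSIM, computed here with scikit-image; the metric choice is our assumption, not a statement of the paper's protocol.

```python
# Illustrative example: pixel-based (PSNR) and structure-based (SSIM) quality
# metrics on a synthetic slice versus its ground-truth reference slice.
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def evaluate_pair(reference: np.ndarray, synthetic: np.ndarray) -> dict:
    """Score one synthetic 2D slice against the reference modality slice."""
    data_range = float(reference.max() - reference.min())
    return {
        "psnr": peak_signal_noise_ratio(reference, synthetic, data_range=data_range),
        "ssim": structural_similarity(reference, synthetic, data_range=data_range),
    }


# Dummy slices stand in for real IXI data here.
ref = np.random.rand(256, 256).astype(np.float32)
syn = (ref + 0.05 * np.random.randn(256, 256)).astype(np.float32)
print(evaluate_pair(ref, syn))
```

Feature-based and distribution-based metrics (e.g., perceptual distances or FID) additionally require a pretrained feature extractor, so they are omitted from this sketch.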

Results and discussion

This section reports the results and delivers an elaborate analysis. Section 4.1 presents the experimental results on the IXI dataset, and a thorough analysis of these results is conducted from a statistical perspective in Section 4.2. In Section 4.3, we perform experiments to discuss the effect of noise on the synthetic images.

Conclusion

In this study, the style transfer technique is introduced into the conditional GAN architecture and the ST-cGAN model is proposed to address cross-modality image synthesis. ST-cGAN receives images of two modalities as conditional input and conducts hierarchical feature mapping and fusion. The style features of the target modality image are extracted and applied to the content features of the synthetic image by adaptive instance normalization, making the synthetic image and the target image possess more consistent style characteristics.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgment

This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. This research made use of the open-source IXI dataset (http://brain-development.org/ixi-dataset/), which hosts resources for the computational analysis of brain development and has made great contributions to brain science research.

References (65)

  • Chartsias, A., et al., Multimodal MR synthesis via modality-invariant latent representation, IEEE Trans. Med. Imaging (2018)
  • Yi, X., et al., Generative adversarial network in medical imaging: a review, Med. Image Anal. (2018)
  • Goodfellow, I., et al., Generative adversarial nets
  • Kazeminia, S., et al., GANs for medical image analysis (2018)
  • Calimeri, F., et al., Biomedical data augmentation using generative adversarial neural networks
  • Denton, E.L., Chintala, S., Szlam, A., Fergus, R., Deep generative image models using a Laplacian pyramid of adversarial...
  • Beers, A., et al., High-resolution medical image synthesis using progressively grown generative adversarial networks (2018)
  • Karras, T., Aila, T., Laine, S., Lehtinen, J., Progressive growing of GANs for improved quality, stability, and variation,...
  • Zhuang, P., et al., fMRI data augmentation via synthesis
  • Richardson, E., et al., On GANs and GMMs (2018)
  • Kingma, D.P., Welling, M., Auto-encoding variational bayes, in: Proceedings of the 2nd International Conference on...
  • Gulrajani, I., et al., Improved training of Wasserstein GANs (2017)
  • Kwon, G., et al., Generation of 3D brain MRI using auto-encoding generative adversarial networks (2019)
  • Rosca, M., Variational approaches for auto-encoding generative adversarial networks (2017)
  • Mirza, M., et al., Conditional generative adversarial nets (2014)
  • Dar, S.U., et al., Image synthesis in multi-contrast MRI with conditional generative adversarial networks, IEEE Trans. Med. Imaging (2019)
  • Yu, B., et al., Ea-GANs: edge-aware generative adversarial networks for cross-modality MR image synthesis, IEEE Trans. Med. Imaging (2019)
  • Yu, B., et al., Sample-adaptive GANs: linking global and local mappings for cross-modality MR image synthesis, IEEE Trans. Med. Imaging (2020)
  • Sharma, A., et al., Missing MRI pulse sequence synthesis using multi-modal generative adversarial network, IEEE Trans. Med. Imaging (2019)
  • Isola, P., Zhu, J.Y., Zhou, T., Efros, A.A., Image-to-image translation with conditional adversarial networks, in:...
  • Gatys, L.A., et al., A neural algorithm of artistic style (2015)
  • Gatys, L.A., et al., Preserving color in neural artistic style transfer (2016)