Brain graph synthesis by dual adversarial domain alignment and target graph prediction from a source graph
Introduction
One major objective of existing machine learning-based methods in medical imaging is to alleviate the high cost of acquiring multiple medical scans and to handle medical datasets with incomplete imaging modalities. For instance, a subject might have a magnetic resonance imaging (MRI) scan but lack a positron emission tomography (PET) scan. Learning-based frameworks for early disease diagnosis are challenged by such missing data, since multimodal medical images, when available, can provide a more holistic understanding of the underlying mechanisms of the target disease. To handle this problem, some existing works discarded samples with missing data. However, this reduces the performance of the predictive model since it learns from a limited number of observations. Existing methods aiming to solve this problem can be categorized into machine learning-based and deep learning-based approaches. In the first category, Huynh et al. (2016) proposed a voxel estimation method that used a structured random forest algorithm to predict a CT image from an MRI image. Another learning-based work (Jog et al., 2013) proposed to synthesize T2-weighted MRI data from T1-weighted data using an ensemble of regression trees.
In the deep learning-based category, Bano et al. (2018) designed a fully convolutional network (XmoNet) for cross-modality MR image synthesis. Li et al. (2014) used convolutional neural networks (CNN) to predict the PET image of a given sample from its MRI image. In a follow-up work, Ben-Cohen et al. (2019) combined a fully convolutional network with a conditional Generative Adversarial Network (GAN) to predict PET from CT. A typical GAN (Goodfellow et al., 2014) consists of two neural networks: a generator trained to synthesize an output that approximates the real data distribution, and a discriminator trained to differentiate between fake and real images. With its remarkable synthesis potential, GAN has been used in a variety of medical imaging applications (Yi et al., 2018), including missing data imputation. For instance, Olut et al. (2018) used GAN to generate missing Magnetic Resonance Angiography (MRA) from T1- and T2-weighted MRI images. However, most of the existing methods have only been applied to Euclidean structured data such as MRI scans and electrocardiogram (ECG) signals (Cho et al., 2018). Hence, they might fail in handling non-Euclidean structured data or ‘geometric data’ types such as graphs and manifolds (Bronstein et al., 2017).
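The adversarial game between generator and discriminator described above is commonly written as the minimax objective of Goodfellow et al. (2014). The symbols used here ($G$ for the generator, $D$ for the discriminator, $x$ for a real sample, $z$ for the generator's random input, and the distributions $p_{\mathrm{data}}$ and $p_z$) are introduced only for illustration and are not taken from this paper's notation table.

```latex
\min_{G} \max_{D} \; V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \big[ \log D(x) \big]
  + \mathbb{E}_{z \sim p_{z}(z)} \big[ \log \big( 1 - D(G(z)) \big) \big]
```

The discriminator maximizes $V$ by assigning high scores to real samples and low scores to generated ones, while the generator minimizes it by producing samples that the discriminator cannot tell apart from real data.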
Recently, the nascent field of geometric deep learning (GDL) has cross-pollinated network neuroscience, where GDL architectures were trained on brain graphs or connectomes to diagnose neurological diseases. The brain connectome is a graph representation of biological activity across a set of anatomical regions of interest (ROIs) in the brain. For instance, Ktena et al. (2017) used a Graph Convolutional Network (GCN) (Kipf and Welling, 2016) to learn a similarity metric between two functional brain graphs extracted from resting-state fMRI (rs-fMRI) data for Autism Spectrum Disorder (ASD) diagnosis. Later on, Parisot et al. (2017) proposed to predict the disease state (healthy or affected) of a subject from a partially labeled graph using GCN. Nodes of the graph represent the functional brain connectivities of a subject extracted from rs-fMRI images, and edges represent the similarity between subjects based on their brain connectomes and their phenotypic information (age and gender). Another recent work (Arslan et al., 2018) introduced a gender prediction framework (i.e., male or female) based on functional brain graphs extracted from rs-fMRI. Using GCN, they selected the most relevant ROIs for gender classification. While these frameworks presented promising results on brain graphs, they overlooked the problem of ‘graph synthesis’. In particular, predicting a target brain graph from a source graph, where each is derived from a different metric (i.e., they have different statistical distributions), remains largely unexplored.
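To make the graph convolution used by the GCN-based works cited above more concrete, the following is a minimal, dependency-free sketch of a single propagation step in the style of Kipf and Welling (2016): ReLU(D^{-1/2} (A + I) D^{-1/2} H W). The function names, the toy 3-ROI graph, and the weight values are hypothetical and chosen only for illustration; the sketch assumes a symmetric adjacency matrix with non-negative weights so that all degrees are positive.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric, non-negative adjacency matrix."""
    n = len(adj)
    # Add self-loops so that each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def gcn_layer(adj, features, weights):
    """One graph convolution step: ReLU(A_norm @ H @ W)."""
    a_norm = normalize_adjacency(adj)
    hidden = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]


# Toy graph with 3 ROIs: ROI 0 is connected to ROIs 1 and 2 (illustrative values).
adj = [[0.0, 1.0, 1.0],
       [1.0, 0.0, 0.0],
       [1.0, 0.0, 0.0]]
features = [[1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]  # one 2-d feature vector per ROI
weights = [[1.0, -1.0], [0.5, 1.0]]              # arbitrary 2x2 layer weights
out = gcn_layer(adj, features, weights)
```

Each output row mixes a node's own features with those of its neighbours, which is why ROIs 1 and 2, having identical features and neighbourhoods, receive identical outputs in this toy example.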
However, predicting a target graph from a source graph drawn from a different domain requires handling the problem of domain fracture, which results from the difference in distribution between the source and target domains. Several works aimed to solve this problem by proposing a GAN-based framework. For instance, Pan et al. (2018b) used a cycle-consistent GAN to predict PET images from MRI data. They used a bi-directional domain mapping (i.e., domain alignment) where they first mapped the MRI source domain to the PET target domain and then learned the reverse mapping. The synthesized PET data were used for early Alzheimer’s Disease (AD) diagnosis. Additionally, Yang et al. (2018) noted that cycleGAN does not impose a constraint between the generated target image and its ground-truth target image. They therefore added a structure-consistency loss to the original cycleGAN and adopted it to predict MRI data from CT data. However, these GAN-based methods focused mainly on synthesizing medical images rather than geometric data, even though several works have demonstrated the ability of GAN to learn accurately from graphs. Wang et al. (2018) recently introduced GraphGAN, a graph embedding method where graphs were projected into a low-dimensional space. Whereas existing graph representation methods were rooted in either generative or discriminative learning frameworks, GraphGAN is a GAN-based method that combines both classes. Moreover, Liu et al. (2017) demonstrated how GAN can learn topological features of any kind of graph. Specifically, they proposed a graph topology interpolator method to divide a graph into multiple subgraphs. Then, in order to better capture topological features, the subgraphs were fed to a GAN.
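The bi-directional mapping used by cycle-consistent GANs can be summarized by the standard cycle-consistency loss. The notation is introduced here for illustration only: $G$ maps the source domain $X$ (e.g., MRI) to the target domain $Y$ (e.g., PET), and $F$ learns the reverse mapping. The exact losses used in the works cited above may differ from this generic form.

```latex
\mathcal{L}_{\mathrm{cyc}}(G, F) =
  \mathbb{E}_{x \sim p_{X}} \big[ \lVert F(G(x)) - x \rVert_{1} \big]
  + \mathbb{E}_{y \sim p_{Y}} \big[ \lVert G(F(y)) - y \rVert_{1} \big]
```

This loss only requires that a sample survives a round trip through both mappings; it does not directly compare $G(x)$ with a ground-truth target, which is the gap that an additional structure-consistency term is meant to address.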
Several neuroscientists have long suspected that the abnormal mental behaviour shown in disordered subjects correlates with specific connectivity features of the brain. For instance, Mahjoub et al. (2018) proposed a brain graph-based representation named multiplex to distinguish between late mild cognitive impairment (MCI) and AD subjects, and detected biomarkers, namely morphological connectivities, that fingerprint the difference between both stages. More recently, Mhiri et al. (2020) proposed a high-resolution brain graph generation framework and discovered several discriminative functional connectivities in the produced brain graphs. We notice here that such results were not derived from a flat data representation (e.g., MRI), but from a deeper one named brain graph, which is a wiring map of the neural connections in a human brain (Fornito et al., 2013). This is explicable because a connectional biomarker of a neurological disorder implies that if an ROI is the most affected brain region at a specific stage of that disorder, the ROIs connected to it will also be affected. Hence, once equipped with the ability to construct brain graphs, neuroscientists will be able to discover more biomarkers, which can help them develop more personalized treatments and advance surgical interventions. However, to date they have lacked the complete medical datasets necessary to fully investigate these hypotheses, since the existing ones usually have missing modalities. Thus, to circumvent the need to acquire multiple brain modalities such as functional MRI or diffusion MRI for the purpose of extracting, respectively, a functional brain graph and a structural brain graph, a brain graph synthesis framework needs to be developed. To fill this gap, we propose, for the first time, a GAN-based framework for predicting target brain graphs from a single source graph. Predicting brain connectomes has several applications in network neuroscience.
First, different connectional aspects of the brain (i.e., source and target brain graphs) can provide complementary information about the whole brain, which helps boost brain disease diagnosis as in Liu et al. (2016) and Wang et al. (2020). For instance, Liu et al. (2016) proposed a connectomic feature selection strategy for the early identification of high-grade glioma (HGG) subjects with survival time over 650. The most reliable features of structural and functional brain graphs, respectively derived from diffusion tensor imaging (DTI) and rs-fMRI, were selected and then fed into an SVM classifier to predict the disease outcome. Another recent paper (Wang et al., 2020) proposed a learning-based framework for multi-class ASD classification using multi-modal brain data extracted from rs-fMRI. More specifically, a sparse representation classifier (SRC) (Wright et al., 2008) was leveraged to classify multi-view ASD connectivity features extracted from white matter and gray matter data. Second, the brain graph synthesis task helps understand the holistic connectional map of the brain. If the generated graphs are reliable and biologically sound, one can use them to create integral connectional maps of the brain called connectional brain templates (CBT) (Dhifallah et al., 2019). A CBT aims to provide a normalized connectional representation of a population using multi-view brain graphs. The CBTs generated from four connectomic datasets were shown to be effective in preserving the connectivity patterns of the population. This concept was recently leveraged in Goktas et al. (2020) to predict the evolution trajectory of a brain graph, where an adversarial connectome encoder learned the CBT’s embeddings, which were passed on to a sample selection block for the target prediction task.
So far, we have identified only a single work on brain graph synthesis (Zhu and Rekik, 2018), which proposed a machine learning-based framework leveraging a multi-kernel manifold learning (MKML) technique to predict multiple target brain graphs nested in different domains from a single source brain graph. This landmark work handled the domain shift problem by mapping each pair of source and target brain graphs onto a shared space where their distributions are aligned and the domain shift is reduced using canonical correlation analysis (CCA). Although promising, the existing image synthesis frameworks based on deep learning and the CCA-based framework (Zhu and Rekik, 2018) are limited in that they treat the domain shift between the source and target data and the multimodal prediction as independent tasks. More specifically, they solve both problems sequentially: they first learn how to align the source data to the target data, and then predict the target data using the aligned source-to-target domains (Fig. 1A).
To address these limitations, we propose a unified Learning-guided Graph Dual Adversarial Domain Alignment (LG-DADA) architecture, which predicts a target brain graph from a source graph while aligning both domains. Specifically, we leverage the adversarially-regularized generative autoencoder (ARGA) proposed in Pan et al. (2018a), which extended the concepts of autoencoder and GAN (Goodfellow et al., 2014) to graphs. ARGA comprises a generator defined as a GCN (Kipf and Welling, 2016) and a discriminator defined as a multilayer perceptron. However, ARGA was originally designed for a graph embedding task and not for graph prediction and domain alignment. In this work, we propose to extend it to jointly solve the domain shift problem and graph prediction. Although ARGA is a good starting point for solving our problem, it is an instance of GANs, which generally suffer from the mode collapse problem (Goodfellow, 2016). This issue occurs when training the generator. Ideally, one would learn a generalizable generator, which is able to generate diverse target samples that cover the distribution of the target data domain well. In practice, however, the generator ends up producing graphs that approximate only a few examples of target graphs, thereby capturing a single mode of the real data distribution. To circumvent this problem, we propose to cluster the source data, which has a heterogeneous distribution, into different homogeneous clusters, where a cluster-specific generator is constrained to generating a specific mode of the target data distribution. Moreover, we aim to simultaneously bridge the distribution shift between source and target graph domains and synthesize the target graph by an alternating bidirectional learning, where bridging the domain shift improves the target graph prediction and vice versa in an iterative, progressive manner (Fig. 1B). Fundamentally, our LG-DADA framework has four stages:
- (1) Feature extraction and clustering. We represent both source and target brain graphs, each encoded in a matrix, by feature vectors. Next, to better learn the inherent statistical distribution of the source data to align with the target, we propose to cluster the source graphs into different homogeneous groups. Then, for each cluster we proceed to the next three steps for target graph prediction.
- (2) Adversarial domain alignment. We propose to align the source domain to the target domain using training samples. This prediction-independent domain alignment is regularized using one discriminator that maps the distribution of the source domain to the target domain.
- (3) Dual adversarial regularization. We propose a prediction-dependent domain alignment where we learn a source embedding for training and testing samples by alternating between two discriminators: the first one matches the distribution of the embedded source graphs with the distribution of the original source graphs, and the second one enforces the embedded source distribution to match the distribution of the predicted target graphs of training subjects.
- (4) Target brain graph prediction. To predict the target graph we first learn a connectomic manifold of the source embedding using the training and testing subjects and the aligned source-to-target graph embedding using the training subjects. To predict the target brain graph of a testing subject, we select the closest neighboring source graphs to the testing graph then average their corresponding target graphs.
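Stages (1) and (4) above can be illustrated with a minimal, dependency-free sketch. It rests on assumptions that are not stated in this section: the feature vector is taken to be the strictly upper-triangular part of a symmetric connectivity matrix, plain Euclidean distance between embeddings stands in for the learned connectomic manifold, and the number of neighbours `k` is arbitrary. The function names `vectorize_graph` and `predict_target` are hypothetical and do not refer to the released LG-DADA code.

```python
import math


def vectorize_graph(matrix):
    """Flatten the strictly upper-triangular part of a symmetric matrix (assumed feature extraction)."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def euclidean(u, v):
    """Euclidean distance, used here as a stand-in for a learned manifold similarity."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def predict_target(test_embedding, train_embeddings, train_targets, k=2):
    """Average the target feature vectors of the k training subjects closest to the test subject."""
    order = sorted(
        range(len(train_embeddings)),
        key=lambda i: euclidean(test_embedding, train_embeddings[i]),
    )
    nearest = order[:k]
    dim = len(train_targets[0])
    return [sum(train_targets[i][d] for i in nearest) / len(nearest) for d in range(dim)]


# Stage (1): a toy 3-ROI source graph turned into a feature vector.
source_graph = [[0.0, 0.2, 0.5],
                [0.2, 0.0, 0.1],
                [0.5, 0.1, 0.0]]
source_features = vectorize_graph(source_graph)

# Stage (4): toy source embeddings and target feature vectors of three training subjects.
train_embeddings = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
train_targets = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0], [9.0, 9.0, 9.0]]
predicted = predict_target([0.4, 0.0], train_embeddings, train_targets, k=2)
```

In this toy example the test embedding is closest to the first two training subjects, so the prediction is the element-wise mean of their target vectors. In the framework itself, the embeddings would come from the dual adversarial regularization of stage (3) rather than being given directly.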
Graph synthesis
So far, seminal geometric deep learning works have been successfully leveraged in a few undirected graph synthesis tasks. Some studies employed a Recurrent Neural Network (RNN) to sequentially generate subgraphs, each consisting of a subset of nodes and their connectivities, from the whole graph (Su et al., 2019; Liao et al., 2019). In the first work (Su et al., 2019), two RNNs were used to learn the nodes embeddings and another RNN was used to
Methodology
In this section, we detail our joint graph prediction and domain alignment framework. We illustrate in Fig. 2 the four proposed steps: (1) extraction and clustering of source and target brain graphs, (2) alignment of the source to the target domain, (3) dual adversarial regularization of source graph embedding, and (4) prediction of target brain graph. For easy reference, we summarize the major mathematical notations in Table 1.
Connectomic dataset
We used a set of 150 structural T1-w MRI data for 75 ASD and 75 NC subjects extracted from the Autism Brain Imaging Data Exchange (ABIDE2) public dataset. Both hemispheres of each subject were reconstructed using FreeSurfer (Fischl, 2012). Then, using the Desikan–Killiany atlas, we parcellated the hemispheres into 35 anatomical regions. Each subject has three morphological brain graphs (MBG) using the following cortical measurements in each hemisphere:
Conclusion
In this work, we introduced LG-DADA, a geometric deep learning framework for target brain graph prediction from a single source graph. Our key contribution consists in designing: (1) a domain alignment of source domain to the target domain by learning their latent representations, and (2) a dual adversarial regularization that synergistically learns a source embedding of training and testing brain graphs using two discriminators and predicts the training target graphs. We evaluated our
CRediT authorship contribution statement
Alaa Bessadok: Methodology, Software, Formal analysis, Validation, Visualization, Writing - original draft. Mohamed Ali Mahjoub: Supervision, Writing - original draft. Islem Rekik: Conceptualization, Supervision, Methodology, Resources, Writing - review & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no conflict of interest.
Acknowledgements
This project has been funded by the 2232 International Fellowship for Outstanding Researchers Program of TUBITAK (Project No:118C288, http://basira-lab.com/reprime/). However, all scientific contributions made in this project are owned and approved solely by the authors.
References (65)
- et al. Cross-modality synthesis from CT to PET using FCN and GAN networks for improved automated lesion detection. Eng. Appl. Artif. Intell. (2019)
- et al. Clustering-based multi-view network fusion for estimating brain network atlases of healthy and disordered populations. J. Neurosci. Methods (2019)
- FreeSurfer. Neuroimage (2012)
- et al. Graph analysis of the human connectome: promise, progress, and pitfalls. Neuroimage (2013)
- et al. Brain graph super-resolution for boosting neurological disorder diagnosis using unsupervised multi-topology connectional brain template learning. Med. Image Anal. (2020)
- et al. Joint functional brain network atlas estimation and feature selection for neurological disorder diagnosis with application to Autism. Med. Image Anal. (2020)
- et al. Learning the number of neurons in deep networks. Advances in Neural Information Processing Systems (2016)
- et al. Joint pairing and structured mapping of convolutional brain morphological multiplexes for early dementia diagnosis. Brain Connect. (2018)
- et al. Wasserstein generative adversarial networks. International Conference on Machine Learning (2017)
- Arslan, S., Ktena, S. I., Glocker, B., Rueckert, D., Graph saliency maps through spectral convolutional networks:...
- Adversarial connectome embedding for mild cognitive impairment identification using cortical morphological networks. International Workshop on Connectomics in Neuroimaging
- XmoNet: a fully convolutional network for cross-modality MR image inference. International Workshop on PRedictive Intelligence In MEdicine
- Geometric deep learning: going beyond Euclidean data. IEEE Signal Process. Mag.
- Prediction to atrial fibrillation using deep convolutional neural networks. International Workshop on PRedictive Intelligence In MEdicine
- StarGAN: unified generative adversarial networks for multi-domain image-to-image translation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
- The relation between Pearson’s correlation coefficient r and Salton’s cosine measure. J. Am. Soc. Inf. Sci. Technol.
- Residual embedding similarity-based network selection for predicting brain network evolution trajectory from a single observation. International Workshop on PRedictive Intelligence In MEdicine
- Generative adversarial nets. Advances in Neural Information Processing Systems
- Canonical correlation analysis: an overview with application to learning methods. Neural Comput.
- Cross-modality image synthesis from unpaired data using CycleGAN. International Workshop on Simulation and Synthesis in Medical Imaging
- CyCADA: cycle-consistent adversarial domain adaptation. International Conference on Machine Learning
- Estimating CT image from MRI data using structured random forest and auto-context model. IEEE Trans. Med. Imaging
- Magnetic resonance image synthesis through patch regression. 2013 IEEE 10th International Symposium on Biomedical Imaging
- Principal component analysis: a review and recent developments. Philos. Trans. R. Soc. A
- Distance metric learning using graph convolutional networks: application to functional brain networks. International Conference on Medical Image Computing and Computer-Assisted Intervention
- Deep learning based imaging data completion for improved brain disease diagnosis. International Conference on Medical Image Computing and Computer-Assisted Intervention
- Efficient graph generation with graph recurrent attention networks. Advances in Neural Information Processing Systems
1 GitHub code: https://github.com/basiralab/LG-DADA