Adaptive sparse learning using multi-template for neurodegenerative disease diagnosis
Introduction
Neurodegenerative diseases, such as Parkinson's disease (PD) (Marek et al., 2011) and Alzheimer's disease (AD) (Alzheimer's, 2015), are among the most common neurological disorders in elderly people (Adeli et al., 2016). PD is a long-term degenerative disorder of the central nervous system that mainly affects the motor system. AD is a chronic neurodegenerative disease that destroys memory and other important mental functions. Since the symptoms of these degenerative diseases of the nervous system appear progressively, patients in the middle or late stages suffer from various inconveniences and endless pain, and even life-threatening problems (Aerts et al., 2012). Apart from motor symptoms, non-motor symptoms such as depression, anxiety, and sleep disorders also degrade patients’ quality of life (Braak et al., 2003).
There is no known cure for neurodegenerative diseases to date (Nilashi et al., 2016; Schaffer et al., 2015). Current diagnosis mainly depends on clinical symptoms and on the clinicians’ knowledge and experience (Nilashi et al., 2016; Postuma et al., 2015). Meanwhile, the conventional clinical symptoms adopted for diagnosis only occur when the relevant biomarkers already show the progression of the lesions (Weiner et al., 2006). Therefore, patients diagnosed using the traditional approaches are mostly at the middle or late stage of such diseases. To accurately identify different stages and reduce patients’ suffering, early automatic diagnosis that can detect the progression of these diseases is highly desirable.
Common prodromal stages of neurodegenerative diseases include scan without evidence of dopaminergic deficit (SWEDD) for PD (Marek et al., 2011) and mild cognitive impairment (MCI) for AD. MCI is further divided into light MCI (lMCI) and stable MCI (sMCI, in which symptoms are stable and will not progress to AD within 18 months) (Alzheimer's, 2015). SWEDD subjects present clinical symptoms without an obvious dopaminergic deficit and have a high potential for PD onset.
Unlike most previous studies, which address binary classification tasks, we consider a multi-class classification problem (e.g., PD vs. SWEDD vs. normal control (NC)) for practical applications. From a practical clinical point of view, a multi-class classifier is more effective than a binary classifier because only one diagnosis decision is required. Since these neurodegenerative diseases are progressive and multiple prodromal stages may occur, patients at different prodromal stages can be recognized for targeted intervention before the nervous system is severely damaged.
To date, numerous studies have shown that neuroimaging techniques are promising for computer-aided diagnosis (CAD) of these diseases (Lei et al., 2017a; Lei et al., 2017b; Lei et al., 2017c; Prashanth et al., 2014; Wei et al., 2017; Zhu et al., 2016a, 2016b). For example, magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) can reveal structural abnormalities of the brain, while positron emission tomography (PET) or functional MRI (fMRI) can capture functional abnormalities of the brain (Salvatore et al., 2014). Many recent studies have used neuroimages and machine learning techniques to predict and assess the stage of diseases. For instance, Rana et al. (2014) proposed a machine learning approach for PD diagnosis with T1-weighted MRI images. Fung and Stoeckel (2007) utilized spatial information for feature selection and classification of single-photon emission computed tomography images for AD.
Generally, a typical CAD pipeline for neurodegenerative disorders consists of data acquisition, feature extraction, feature selection, and classification (Zhang and Shen, 2012). The information provided by neuroimages is often high-dimensional, which leads to overfitting when the sample size is limited. To tackle this issue, a feature selection method is typically applied (Jothi and Hannah, 2016; Ozcift, 2012) to find a subset of features. Feature selection simplifies the prediction model and avoids the curse of dimensionality, thus enhancing the generalization ability of the prediction model. Methods like subspace learning can also achieve this goal by transforming the original data space into a low-dimensional space (Seeley et al., 2009; Wang et al., 2016). With regard to the interpretability of brain features, feature selection methods are preferable to subspace learning methods, because the selected features remain tied to the original brain regions. However, it is reasonable to combine feature selection and subspace learning to build a disease diagnosis model that is both interpretable and accurate (Zhu et al., 2016b). Motivated by this, we combine feature selection and subspace learning into a unified framework to select the most discriminative features for automatic diagnosis.
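The contrast between feature selection and subspace learning can be illustrated with a minimal sketch. It is not the method proposed in this paper: the filter criterion (absolute Pearson correlation with the label), the use of PCA as the subspace method, the function names, and the synthetic data are all assumptions chosen for illustration. The point is only that selection returns indices of original features, whereas a projection mixes all features into new axes.

```python
import numpy as np

def select_top_k_features(X, y, k):
    """Filter-style selection: keep the k columns most correlated with the label."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    scores = np.abs(Xc.T @ yc) / np.maximum(denom, 1e-12)
    keep = np.sort(np.argsort(scores)[::-1][:k])  # indices of original features
    return X[:, keep], keep

def pca_project(X, n_components):
    """Subspace learning via PCA: project centered data onto the top principal axes."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]  # each row mixes all original features
    return Xc @ components.T, components

# Synthetic data (hypothetical): fewer subjects than features, as described above.
rng = np.random.default_rng(0)
n_subjects, n_features = 30, 200
X = rng.normal(size=(n_subjects, n_features))
y = (X[:, 5] + X[:, 17] > 0).astype(float)  # label driven by two features

X_sel, kept = select_top_k_features(X, y, k=10)
X_sub, components = pca_project(X, n_components=10)
print("kept feature indices:", kept)
print("selected shape:", X_sel.shape, "projected shape:", X_sub.shape)
```

Both outputs have ten columns, but only `kept` can be mapped back to specific brain features, which is the interpretability argument made above.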
In addition, we build a feature selection model based on sparse least squares regression. Since we may encounter multiple different classification tasks for neurodegenerative diseases, different degrees of sparseness may be required according to the specific feature relationships and properties of each task. Adaptive sparse learning is appealing because it adapts the degree of sparseness to achieve a better recognition rate (Grandvalet, 2002), so we employ an adaptive strategy to control the degree of sparseness in our unified model. In other words, the ratio of zeros in the weight matrix can be adjusted according to the classification task.
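To make "the ratio of zeros in the weight matrix" concrete, the following sketch solves a row-sparse least squares problem and reports how many rows of the weight matrix are exactly zero for different regularization strengths. It is an illustration under stated assumptions, not the adaptive scheme of this paper: the regularizer here is the fixed l2,1-norm, the solver is a plain proximal gradient loop, sparseness is controlled by a hand-set parameter `lam`, and the data are synthetic.

```python
import numpy as np

def row_sparse_least_squares(X, Y, lam, n_iter=500):
    """Proximal gradient for min_W 0.5*||XW - Y||_F^2 + lam * sum_i ||W[i, :]||_2."""
    W = np.zeros((X.shape[1], Y.shape[1]))
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        V = W - step * (X.T @ (X @ W - Y))  # gradient step on the smooth term
        row_norms = np.linalg.norm(V, axis=1, keepdims=True)
        # Row-wise soft-thresholding: rows with a small norm become exactly zero.
        shrink = np.maximum(0.0, 1.0 - step * lam / np.maximum(row_norms, 1e-12))
        W = shrink * V
    return W

def zero_row_ratio(W):
    """Fraction of features (rows of W) whose weights are all exactly zero."""
    return float(np.mean(np.linalg.norm(W, axis=1) == 0.0))

# Synthetic data (hypothetical): 20 features, of which only 3 are informative.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 20))
W_true = np.zeros((20, 3))
W_true[:3] = rng.normal(size=(3, 3))
Y = X @ W_true + 0.1 * rng.normal(size=(40, 3))

for lam in (0.0, 5.0, 50.0):
    W = row_sparse_least_squares(X, Y, lam)
    print(f"lambda={lam:5.1f}  zero-row ratio={zero_row_ratio(W):.2f}")
```

A larger `lam` drives more rows to zero, so fewer features survive. An adaptive method would choose the degree of sparseness per task rather than fixing it by hand as done here.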
Single-template based methods obtain a simple morphometric representation of each brain image via a certain nonlinear registration method. In contrast, multi-template based methods are more promising for discovering disease status and comparing group differences (Liu et al., 2016b). Previous studies (Jin et al., 2015; Liu et al., 2016a; Min et al., 2014) suggest that learning with multiple templates can boost diagnosis accuracy. For example, Min et al. (2014) concatenated the multi-template based features of each subject and achieved promising AD classification results. Multiple templates not only represent the brain information in a comprehensive way, but also capture the disease-related discriminative information (Liu et al., 2016b). In addition, multi-template based methods extract multiple feature sets of a subject from different templates (Jin et al., 2015; Liu et al., 2016a; Min et al., 2014), which can effectively reduce the negative impact of registration errors and provide distinct yet complementary information for identifying different disease statuses, and thus leads to more promising identification performance.
Inspired by this, we use multiple atlases with different sets of regions of interest (ROIs) to extract different sets of features from the brain images. These features are fused to enhance classification performance by constructing a larger and more discriminative feature space with a reduced dimension. Specifically, we use the automated anatomical labeling (AAL) atlas (Tzourio-Mazoyer et al., 2002) with 90 and 116 regions, since AAL is the most widely used atlas for brain region extraction, and Craddock's spatially constrained spectral clustering atlas (Craddock et al., 2012) with 200 regions. The AAL template provides 116 ROIs over the full brain, including the cerebellum, and 90 ROIs without it. More ROIs may increase interpretability since more information is provided. Craddock's atlas is also available with more than 200 ROIs, but we select 200 because a higher number of ROIs makes efficient feature extraction more difficult. Finally, we fuse these features by linear concatenation.
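Fusion by linear concatenation can be sketched in a few lines. The ROI counts (90, 116, 200) come from the text. The function name, the template labels, and the random feature values are placeholders for illustration only.

```python
import numpy as np

def fuse_templates(feature_sets):
    """Concatenate per-template feature matrices (subjects x ROIs) column-wise."""
    n_subjects = {features.shape[0] for features in feature_sets}
    if len(n_subjects) != 1:
        raise ValueError("all templates must describe the same subjects")
    return np.hstack(feature_sets)

# Placeholder features: one matrix per template, rows are subjects.
rng = np.random.default_rng(0)
n_subjects = 8
roi_counts = {"AAL-90": 90, "AAL-116": 116, "Craddock-200": 200}
per_template = [rng.random((n_subjects, n)) for n in roi_counts.values()]

fused = fuse_templates(per_template)
print(fused.shape)  # one row per subject, 90 + 116 + 200 columns
```

With these three templates each subject is described by 406 features, which is why a subsequent feature selection step matters when the number of subjects is small.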
On this note, we propose a multi-template based adaptive feature selection method to build a reliable classification model. We also integrate linear discriminant analysis (LDA) (Lin et al., 2010) and locality preserving projections (LPP) (Zhu et al., 2016b) to construct the most informative subspace with an adaptive sparse regularization (Zhang et al., 2011). LDA considers the global information by weighing the within-class variance against the between-class variance, while LPP reflects the local information by finding the similarity relevance within each feature. With the help of both global and local information in the data, we select the most discriminative features and discard irrelevant ones to enhance the classification performance in the learned feature subspace (Zhang and Ye, 2011). Unlike existing methods that focus only on binary classification with a single template, we simultaneously classify multiple clinical statuses using multiple templates for practical clinical application.
The rest of this paper is organized as follows. Section II reviews various feature selection and subspace learning methods for neurodegenerative disease recognition. Section III introduces the methodology of the proposed method. Experimental results are presented in Section IV. Discussions and conclusions are provided in Sections V and VI, respectively.
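The global and local quantities mentioned above can be written down with standard textbook definitions. The sketch below computes the LDA within-class and between-class scatter matrices and an LPP-style graph Laplacian. It assumes a heat-kernel affinity over k nearest neighbours built between samples, which is the usual LPP construction. How this paper builds its graph and combines the terms is not specified in this excerpt, so the function names, parameters (`n_neighbors`, `t`), and data are illustrative assumptions.

```python
import numpy as np

def lda_scatter(X, y):
    """Within-class (S_w) and between-class (S_b) scatter matrices used by LDA."""
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        centered = Xc - mu_c
        S_w += centered.T @ centered
        diff = (mu_c - mu)[:, None]
        S_b += Xc.shape[0] * (diff @ diff.T)
    return S_w, S_b

def lpp_laplacian(X, n_neighbors=3, t=1.0):
    """Graph Laplacian L = D - S from a symmetrized k-NN heat-kernel affinity S."""
    sq = np.sum(X ** 2, axis=1)
    dist2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    n = X.shape[0]
    S = np.zeros((n, n))
    for i in range(n):
        order = np.argsort(dist2[i])
        neighbors = [j for j in order if j != i][:n_neighbors]
        S[i, neighbors] = np.exp(-dist2[i, neighbors] / t)
    S = np.maximum(S, S.T)  # keep an edge if either sample lists the other
    return np.diag(S.sum(axis=1)) - S, S

# Synthetic three-class data (hypothetical), e.g. standing in for PD / SWEDD / NC.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 5)
X = rng.normal(size=(15, 4)) + y[:, None]

S_w, S_b = lda_scatter(X, y)
L, S = lpp_laplacian(X)
print("trace(S_w) =", np.trace(S_w), " trace(S_b) =", np.trace(S_b))
```

LDA-style criteria favour projections with small within-class and large between-class scatter, while an LPP-style term penalizes projections that separate samples connected in the neighbourhood graph.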
Feature selection
Due to the challenge of high dimensionality and limited sample size, overfitting can occur in data-driven analysis (Kong et al., 2014). To address this problem, most existing methods design a feature selection process to select the most discriminative neuroimaging features or a sample selection process to discard redundant samples (Fung and Stoeckel, 2007; Lei et al., 2017a). An l1-regularizer (i.e., a sparse term) was introduced in the estimation model for feature selection when
Methodology
An overview of our multi-class classification method is presented in Fig. 1. The preprocessing pipeline is the same as in Lei et al. (2017b). First, we preprocess the original T1-weighted brain MRI image with the statistical parametric mapping (SPM) tool for segmentation (Friston, 2003). Next, we extract the tissue volume in the segmented regions with the AAL atlas. Finally, we use the corresponding tissue volume values as feature vectors and concatenate them linearly for feature representation.
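The extraction of per-ROI tissue volumes can be sketched as follows, assuming that segmentation has already produced a tissue probability map and that an atlas label volume is available on the same voxel grid. The function name, the label convention (0 for background), and the toy arrays are assumptions for illustration. Loading real images and the SPM segmentation step itself are outside this sketch.

```python
import numpy as np

def roi_tissue_volumes(tissue_prob, atlas_labels, n_rois, voxel_volume_mm3=1.0):
    """Sum tissue probabilities inside each ROI (labels 1..n_rois; 0 is background)."""
    if tissue_prob.shape != atlas_labels.shape:
        raise ValueError("tissue map and atlas must share the same voxel grid")
    sums = np.bincount(atlas_labels.ravel(),
                       weights=tissue_prob.ravel(),
                       minlength=n_rois + 1)
    return sums[1:n_rois + 1] * voxel_volume_mm3

# Toy 4x4x4 volume (hypothetical): ROI 1 has 32 voxels, ROI 2 has 16, ROI 3 is empty.
atlas = np.zeros((4, 4, 4), dtype=int)
atlas[:2] = 1
atlas[2:, :2] = 2
gray_matter = np.full((4, 4, 4), 0.5)    # stand-in for a segmented probability map
white_matter = np.full((4, 4, 4), 0.25)

gm_volumes = roi_tissue_volumes(gray_matter, atlas, n_rois=3, voxel_volume_mm3=2.0)
wm_volumes = roi_tissue_volumes(white_matter, atlas, n_rois=3, voxel_volume_mm3=2.0)
feature_vector = np.concatenate([gm_volumes, wm_volumes])  # linear concatenation
print(feature_vector)
```

Using `np.bincount` with weights avoids an explicit loop over ROIs and naturally returns zero for labels that do not occur in the atlas volume.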
We
Experiments and results
In our study, we use two publicly available datasets, PPMI (Marek et al., 2011) and ADNI (Alzheimer's, 2015), to compare the proposed method with other widely used methods such as ElasticNet and LASSO (Tibshirani, 1996). We also compare the proposed method with other state-of-the-art feature selection methods applied to neurodegenerative disease diagnosis: multi-modal multi-task (M3T) (Zhang and Shen, 2012), joint sparse learning for classification and regression (JSL) (Lei et al., 2017b),
Discussions
We investigate the importance of the brain regions via the frequency of the selected ROIs by the proposed method using MRI images. To further study the relationship between brain regions and neurodegenerative disease, we attempt to identify the other top brain regions that are most correlated with other brain regions under the assumption that disease-related ROIs affect each other. We use the weighting matrix to calculate the Pearson correlation coefficient to represent the correlation among
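A Pearson-correlation analysis of a weighting matrix can be sketched as below. The excerpt does not state what the columns of the weighting matrix are, so the layout used here (one row per ROI, one column per weight profile entry), the function names, and the random weights are assumptions. The ranking by mean absolute correlation is likewise only one plausible way to pick "top" regions.

```python
import numpy as np

def roi_correlation(W):
    """Pearson correlation between the weight profiles (rows) of every ROI pair."""
    return np.corrcoef(W)

def rank_rois_by_connectivity(corr):
    """Order ROIs by mean absolute correlation with all other ROIs."""
    n = corr.shape[0]
    strength = (np.abs(corr).sum(axis=1) - 1.0) / (n - 1)  # drop the diagonal 1
    return np.argsort(strength)[::-1], strength

# Hypothetical weighting matrix: 6 ROIs, 10 weights per ROI.
rng = np.random.default_rng(0)
W = rng.normal(size=(6, 10))
W[1] = 2.0 * W[0] + 1.0  # ROI 1 is a linear function of ROI 0

corr = roi_correlation(W)
order, strength = rank_rois_by_connectivity(corr)
print("correlation between ROI 0 and ROI 1:", corr[0, 1])
print("ROIs ordered by mean |correlation|:", order)
```

Pearson correlation is invariant to positive scaling and shifts of a profile, so ROI 0 and ROI 1 in this toy example are perfectly correlated even though their weights differ in magnitude.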
Conclusions
In this paper, we introduce a multi-template adaptive sparse learning method together with a multi-class classification model for neurodegenerative disease diagnosis. We use multiple brain parcellation atlases with different sets of regions of interest and fuse the resulting features. Specifically, we integrate feature selection and subspace learning with a p-norm regularization. In the constructed subspace, we jointly consider the global and local information in the data space. To further
Declaration of Competing Interest
There are no conflicts of interest.
Acknowledgements
This work was supported partly by National Natural Science Foundation of China (Nos. 61871274, 61801305, 61806071 and 81571758), the Integration Project of Production Teaching and Research by Guangdong Province and Ministry of Education (No. 2012B091100495), Key Laboratory of Medical Image Processing of Guangdong Province (No. K217300003), Guangdong Province Key Laboratory of Popular High Performance Computers (No. 2017B03031407), Guangdong Pearl River Talents Plan (2016ZT06S220), Shenzhen
References (52)
- et al. Joint Feature-Sample Selection and Robust Diagnosis of Parkinson’s Disease From MRI Data. Neuroimage (2016)
- et al. Staging of brain pathology related to sporadic Parkinson’s disease. Neurobiol. Aging (2003)
- et al. Abnormal amygdala function in Parkinson’s disease patients and its relationship to depression. J. Affect. Disord. (2015)
- et al. Joint detection and clinical score prediction in Parkinson’s disease via multi-modal sparse learning. Expert Syst. Appl. (2017)
- et al. Automated classification of multi-spectral MR images using linear discriminant analysis. Comput. Med. Imaging Gr. (2010)
- et al. The Parkinson progression marker initiative (PPMI). Prog. Neurobiol. (2011)
- et al. Automatic classification and prediction models for early Parkinson’s disease diagnosis from SPECT imaging. Expert Syst. Appl. (2014)
- et al. Machine learning on brain MRI data for differential diagnosis of Parkinson's disease and progressive supranuclear palsy. Journal of Neuroscience Methods (2014)
- et al. Biomarkers in the diagnosis and prognosis of Alzheimer’s disease. J. Assoc. Lab. Autom. (2015)
- et al. Neurodegenerative diseases target large-scale human brain networks. Neuron (2009)
- Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage
- Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease. Neuroimage
- Multimodal classification of Alzheimer’s disease and mild cognitive impairment. Neuroimage
- Improving the Diagnostic Accuracy in Parkinsonism: A Three-Pronged Approach. Practical Neurology
- Alzheimer's disease facts and figures. Alzheimer's & Dementia: The Journal of the Alzheimer's Association
- Cerebrospinal fluid and plasma biomarkers in Alzheimer disease. Nat. Rev. Neurol.
- Accelerated gradient method for multi-task sparse learning problem
- A whole brain fMRI atlas generated via spatially constrained spectral clustering. Hum. Brain Mapp.
- Prognosis and diagnosis of Parkinson's disease using multi-task learning
- Cerebrospinal fluid biomarkers for the diagnosis and prognosis of Parkinson's disease: protocol for a systematic review and individual participant data meta-analysis. BMJ Open
- Statistical Parametric Mapping. In Neuroscience Databases
- SVM feature selection for classification of SPECT images of Alzheimer’s disease using spatial information. Knowl. Inf. Syst.
- Adaptive scaling for feature selection in SVMs. Paper presented at the
- CSF biomarkers and clinical progression of Parkinson’s disease. Neurology
- Low-rank tensor subspace learning for RGB-D action recognition. IEEE Trans. Image Process.