Adaptive sparse learning using multi-template for neurodegenerative disease diagnosis

https://doi.org/10.1016/j.media.2019.101632

Highlights

  • An adaptive feature learning framework using multiple templates.

  • An adaptively chosen sparse degree.

  • Joint linear discriminative analysis and locally preserving projections.

  • Identification of the highly relevant brain regions.

Abstract

Neurodegenerative diseases affect millions of patients, especially elderly people. Early detection and management of these diseases are crucial, as the clinical symptoms take years to appear after the onset of neurodegeneration. This paper proposes an adaptive feature learning framework using multiple templates for early diagnosis. A multi-classification scheme is developed based on multiple brain parcellation atlases with various regions of interest. Different sets of features are extracted and then fused, and feature selection is applied with an adaptively chosen sparse degree. In addition, both linear discriminative analysis and locally preserving projections are integrated to construct a least square regression model. Finally, we propose a feature space to predict the severity of the disease under the guidance of clinical scores. Our proposed method is validated on both the Alzheimer's Disease Neuroimaging Initiative (ADNI) and the Parkinson's Progression Markers Initiative (PPMI) databases. Extensive experimental results suggest that the proposed method outperforms state-of-the-art methods such as multi-modal multi-task learning and joint sparse learning. Our method demonstrates that accurate feature learning facilitates the identification of the highly relevant brain regions that contribute significantly to the prediction of disease progression. This may pave the way for further medical analysis and diagnosis in practical applications.

Introduction

Neurodegenerative diseases, such as Parkinson's disease (PD) (Marek et al., 2011) and Alzheimer's disease (AD) (Alzheimer's, 2015), are among the most common neurological disorders in elderly people (Adeli et al., 2016). PD is a long-term degenerative disorder of the central nervous system that mainly affects the motor system. AD is a chronic neurodegenerative disease that destroys memory and other important mental functions. Since the symptoms of these degenerative diseases of the nervous system appear progressively, patients in the middle or late stages suffer from various inconveniences and persistent pain, and even from life-threatening problems (Aerts et al., 2012). Apart from motor symptoms, non-motor symptoms such as depression, anxiety, and sleep disorders also degrade patients’ quality of life (Braak et al., 2003).

There is no known cure for neurodegenerative diseases to date (Nilashi et al., 2016; Schaffer et al., 2015). Current diagnosis mainly depends on clinical symptoms and on the clinicians’ knowledge and experience (Nilashi et al., 2016; Postuma et al., 2015). Meanwhile, the conventional clinical symptoms adopted for diagnosis only occur when the relevant biomarkers already show the progression of the lesions (Weiner et al., 2006). Therefore, patients diagnosed using the traditional approaches are mostly at the middle or late stage of such diseases. To accurately identify different stages, improve analytical effectiveness, and reduce patients’ suffering, early automatic medical diagnosis is highly desirable for detecting the progression of these diseases.

Common prodromal stages of neurodegenerative diseases include scan without evidence of dopaminergic deficit (SWEDD) for PD (Marek et al., 2011) and mild cognitive impairment (MCI) for AD. MCI is further divided into light MCI (lMCI, referring to patients who suffer from light MCI) and stable MCI (sMCI, referring to patients whose symptoms are stable and will not progress to AD within 18 months) (Alzheimer's, 2015). SWEDD patients present the clinical symptoms without an obvious dopaminergic deficit and have a high potential of PD onset.

Unlike most previous work on binary classification tasks, we consider a multi-class classification problem (e.g., PD vs. SWEDD vs. NC) for practical applications. From a practical clinical point of view, it is more effective to build a multi-class classifier than a binary classifier, as only one diagnosis decision is required. Since these neurodegenerative diseases are progressive and multiple prodromal stages may occur, patients at different prodromal stages can be recognized for targeted intervention before the nervous system is severely damaged.

To date, numerous studies have shown that neuroimaging techniques are promising for computer-aided diagnosis (CAD) of these diseases (Lei et al., 2017a; Lei et al., 2017b; Lei et al., 2017c; Prashanth et al., 2014; Wei et al., 2017; Zhu et al., 2016a, 2016b). For example, magnetic resonance imaging (MRI) and diffusion tensor imaging (DTI) can reveal structural abnormalities of the brain, while positron emission tomography (PET) and functional MRI (fMRI) can capture functional abnormalities of the brain (Salvatore et al., 2014). Many recent studies have used neuroimages to predict and assess the stage of diseases with machine learning techniques. For instance, Rana et al. (2014) proposed a machine learning approach for PD diagnosis with T1-weighted MRI images. Fung and Stoeckel (2007) utilized spatial information for feature selection and classification from single-photon emission computed tomography images for AD.

Generally, a typical CAD pipeline for neurodegenerative disorders consists of data acquisition, feature extraction, feature selection, and classification (Zhang and Shen, 2012). Information provided by neuroimages is often of high dimension, which leads to overfitting when the sample size is limited. To tackle this issue, a feature selection method is typically applied (Jothi and Hannah, 2016; Ozcift, 2012) to find a subset of features. Feature selection simplifies the prediction model and mitigates the curse of dimensionality, thus enhancing the generalization ability of the prediction model. Methods like subspace learning can also achieve this goal by transforming the original data space into a low-dimensional space (Seeley et al., 2009; Wang et al., 2016). With regard to the interpretability of brain features, feature selection methods are preferable to subspace learning methods. However, it is reasonable to combine feature selection and subspace learning to build an interpretable as well as accurate disease diagnosis prediction model (Zhu et al., 2016b). Motivated by this, we combine both feature selection and subspace learning into a unified framework to select the most discriminative features for automatic diagnosis.
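As a rough illustration of how a sparse regularizer turns regression into feature selection, the sketch below fits an l1-regularized least squares model with a plain proximal gradient loop (ISTA) on synthetic data. The function name `lasso_ista`, the synthetic data, and the value of `lam` are assumptions made for this example only; this is a generic LASSO baseline, not the model proposed in this paper.

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/2n)*||Xw - y||^2 + lam*||w||_1 by proximal gradient descent (ISTA)."""
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the smooth part (sigma_max(X)^2 / n).
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft thresholding sets small coefficients exactly to zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w

# Synthetic data (assumption): 20 features, only the first 3 are informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
true_w = np.zeros(20)
true_w[:3] = [2.0, -3.0, 1.5]
y = X @ true_w + 0.1 * rng.standard_normal(100)

w = lasso_ista(X, y, lam=0.1)
selected = np.flatnonzero(w)  # indices of the features kept by the model
print(selected)
```

Features whose coefficients are driven to zero are discarded, so the surviving indices remain directly interpretable, which is the property that makes feature selection attractive for brain-region analysis.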

In addition, we build a feature selection model based on sparse least squares regression. Since we may encounter multiple different classification tasks for neurodegenerative diseases, different degrees of sparseness may be required according to the specific feature relationships and properties of each task. Adaptive sparse learning is appealing because it adapts the sparseness degree to achieve a better recognition rate (Grandvalet, 2002), so we employ an adaptive strategy to control the sparseness degree in our unified model. In other words, the ratio of zeros in the weight matrix can be adjusted according to the classification task.
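To make the idea of a tunable sparseness degree concrete, the following sketch measures the ratio of zero rows in a weight matrix and evaluates a row-sparsity penalty whose exponent `p` can be varied. The function names, the tolerance, and the specific form of the penalty (sum of row l2 norms raised to the power `p`) are illustrative assumptions, not the regularizer defined in this paper.

```python
import numpy as np

def row_sparsity_ratio(W, tol=1e-8):
    """Fraction of rows (features) of W whose l2 norm is numerically zero."""
    row_norms = np.linalg.norm(W, axis=1)
    return float(np.mean(row_norms <= tol))

def row_sparsity_penalty(W, p):
    """Sum over rows of (row l2 norm)**p; smaller p favours sparser solutions."""
    row_norms = np.linalg.norm(W, axis=1)
    return float(np.sum(row_norms ** p))

# Toy weight matrix (assumption): 4 features x 2 classes, two features unused.
W = np.array([[0.0, 0.0],
              [3.0, 4.0],
              [0.0, 0.0],
              [0.0, 1.0]])

print(row_sparsity_ratio(W))           # 0.5
print(row_sparsity_penalty(W, p=1.0))  # 6.0
print(row_sparsity_penalty(W, p=0.5))
```

Lowering `p` shrinks the relative cost of large rows, so an optimizer under such a penalty tends to concentrate weight on fewer features; choosing `p` per task is one way a sparseness degree could be adapted.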

It is known that single-template based methods obtain only a simple morphometric representation of each brain image via a certain nonlinear registration method. In contrast, multi-template based methods are more promising for discovering disease status and comparing group differences (Liu et al., 2016b). Previous studies (Jin et al., 2015; Liu et al., 2016a; Min et al., 2014) suggest that learning with multiple templates can boost diagnosis accuracy. For example, Min et al. (2014) concatenated the multi-template based features of each subject and achieved promising AD classification results. Multiple templates not only represent the brain information in a comprehensive way, but also capture the disease-related discriminative information (Liu et al., 2016b). Moreover, multi-template based methods can extract multiple feature sets of a subject derived from different templates (Jin et al., 2015; Liu et al., 2016a; Min et al., 2014), which can effectively reduce the negative impact of registration errors and provide distinct yet complementary information for identifying different disease statuses, leading to more promising identification performance.

Inspired by this, we use multiple atlases with different sets of regions of interest (ROIs) to extract different sets of features from the brain images. These features are fused to enhance classification performance by constructing a larger and more discriminative feature space, whose dimension is subsequently reduced by feature selection. Specifically, we use the automated anatomical labeling (AAL) atlas (Tzouriomazoyer et al., 2002) with 90 and 116 regions, since AAL is the most widely used atlas for brain region extraction, and Craddock's spatially constrained spectral clustering atlas (Craddock et al., 2012) with 200 regions. The AAL template provides 116 ROIs for the full brain including the cerebellum and 90 ROIs without it. Using more ROIs may improve interpretability, since more information is provided. The Craddock atlas is also available with more than 200 ROIs, but we select the 200-ROI version because a higher number of ROIs makes efficient feature extraction more difficult. Finally, we fuse these features by linear concatenation.
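The fusion step described above can be sketched as a column-wise concatenation of per-template feature matrices. The dictionary keys, the function name `fuse_templates`, and the random placeholder features are assumptions for illustration; only the ROI counts (90, 116, 200) come from the text.

```python
import numpy as np

# ROI counts taken from the text; the key names are hypothetical labels.
ROI_COUNTS = {"aal90": 90, "aal116": 116, "cc200": 200}

def fuse_templates(features_by_template):
    """Concatenate per-template feature matrices (subjects x ROIs) column-wise."""
    names = sorted(features_by_template)  # fixed order keeps columns reproducible
    n_subjects = {features_by_template[name].shape[0] for name in names}
    if len(n_subjects) != 1:
        raise ValueError("all templates must cover the same subjects")
    return np.hstack([features_by_template[name] for name in names])

# Placeholder features for 5 subjects (random values, not real data).
rng = np.random.default_rng(0)
features = {name: rng.random((5, n_roi)) for name, n_roi in ROI_COUNTS.items()}

fused = fuse_templates(features)
print(fused.shape)  # (5, 406)
```

With these three templates each subject is described by 90 + 116 + 200 = 406 features, which is why a subsequent feature selection step is needed when the number of subjects is small.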

On this note, we propose a multi-template based adaptive feature selection method to build a reliable classification model. We also integrate linear discriminative analysis (LDA) (Lin et al., 2010) and locally preserving projections (LPP) (Zhu et al., 2016b) to construct the most informative subspace with an adaptive sparse regularization (Zhang et al., 2011). LDA captures the global information by weighing the within-class variance against the between-class variance, while LPP reflects the local information by preserving the similarity relationships among neighboring data. With the help of both global and local information in the data, we select the most discriminative features and discard irrelevant ones to enhance the classification performance in the learned feature subspace (Zhang and Ye, 2011). Unlike existing methods that focus only on binary classification with a single template, we simultaneously classify multiple clinical statuses using multiple templates for practical clinical application.
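The global and local ingredients mentioned above can be illustrated with their standard building blocks: the within-class and between-class scatter matrices used by LDA, and a neighborhood graph Laplacian of the kind used by LPP. This is a minimal sketch under stated assumptions: the k-nearest-neighbour graph, the heat-kernel weights, the parameter values, and the function names are common textbook choices picked for this example, and the source does not confirm how the paper builds these terms.

```python
import numpy as np

def lda_scatter(X, y):
    """Within-class (Sw) and between-class (Sb) scatter of X (samples x features)."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sw = np.zeros((d, d))
    Sb = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        centered = Xc - mean_c
        Sw += centered.T @ centered
        diff = (mean_c - mean_all)[:, None]
        Sb += Xc.shape[0] * (diff @ diff.T)
    return Sw, Sb

def lpp_laplacian(X, k=2, sigma=1.0):
    """Graph Laplacian L = D - S of a symmetric k-NN graph with heat-kernel weights.

    Assumes no duplicate samples, so each sample is its own nearest neighbour.
    """
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    S = np.zeros_like(sq_dist)
    for i in range(X.shape[0]):
        neighbours = np.argsort(sq_dist[i])[1:k + 1]  # skip the sample itself
        S[i, neighbours] = np.exp(-sq_dist[i, neighbours] / (2 * sigma ** 2))
    S = np.maximum(S, S.T)  # make the graph symmetric
    return np.diag(S.sum(axis=1)) - S

# Toy data (assumption): three classes with two samples each.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0],
              [1.1, 0.9], [2.0, 0.0], [2.1, 0.2]])
y = np.array([0, 0, 1, 1, 2, 2])

Sw, Sb = lda_scatter(X, y)
L = lpp_laplacian(X, k=2)
```

A projection that makes the between-class scatter large relative to the within-class scatter separates the classes globally, while a small value of tr(PᵀXᵀLXP) for a projection P keeps neighbouring samples close after projection.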

The rest of this paper is organized as follows. Section II reviews various feature selection and subspace learning methods for neurodegenerative disease recognition. Section III introduces the methodology of the proposed method. Experimental results are presented in Section IV. Discussions and conclusions are provided in Sections V and VI, respectively.

Section snippets

Feature selection

Due to the challenge of high dimensionality and limited sample size, the overfitting problem could occur in data-driven analysis (Kong et al., 2014). To address this problem, most existing methods design a feature selection process to select the most discriminative neuroimaging features or a sample selection process to discard redundant samples (Fung and Stoeckel, 2007; Lei et al., 2017a). An l1-regularizer (i.e., a sparse term) was introduced in the estimation model for feature selection when

Methodology

The overview of our multi-class classification method is presented in Fig. 1. The preprocessing pipeline is the same as in Lei et al. (2017b). First, we preprocess the original T1-weighted brain MRI image with the statistical parametric mapping (SPM) tool for segmentation (Friston, 2003). Then, we extract the tissue volume in the segmented regions with the AAL atlas. Finally, we take the corresponding tissue volume values as feature vectors and concatenate them linearly for feature representation.
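The ROI-wise volume extraction described above can be sketched as follows, assuming the tissue segmentation and the atlas label map are already available as arrays on the same voxel grid. The function name `roi_tissue_volumes`, the label convention (0 for background, 1..n for ROIs), and the toy volumes are assumptions for this example; the actual segmentation in the paper is done with SPM and is not reproduced here.

```python
import numpy as np

def roi_tissue_volumes(tissue_map, atlas_labels, n_rois, voxel_volume=1.0):
    """Sum tissue probabilities inside each ROI label 1..n_rois (0 = background)."""
    if tissue_map.shape != atlas_labels.shape:
        raise ValueError("tissue map and atlas must be on the same voxel grid")
    sums = np.bincount(atlas_labels.ravel(),
                       weights=tissue_map.ravel(),
                       minlength=n_rois + 1)
    return sums[1:n_rois + 1] * voxel_volume

# Toy 2x2x2 volume with two ROIs (values are illustrative, not real data).
atlas = np.array([[[0, 1], [1, 1]],
                  [[2, 2], [0, 2]]])
gm = np.array([[[0.9, 0.5], [0.5, 1.0]],
               [[0.25, 0.25], [0.3, 0.5]]])

volumes = roi_tissue_volumes(gm, atlas, n_rois=2)
print(volumes)  # [2. 1.]
```

Each subject then yields one feature vector per atlas, with one entry per ROI, which is the form expected by the concatenation step.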

We

Experiments and results

In our study, we use the two publicly available datasets, PPMI (Marek et al., 2011) and ADNI (Alzheimer's, 2015) to compare the proposed method with other widely used methods such as ElasticNet and LASSO (Tibshirani, 1996). We also compare the proposed method with other state-of-the-art feature selection methods applied for neurodegenerative disease diagnosis: multi-modal multi-task (M3T) (Zhang and Shen, 2012), joint sparse learning for classification and regression (JSL) (Lei et al., 2017b),

Discussions

We investigate the importance of the brain regions via the frequency of the selected ROIs by the proposed method using MRI images. To further study the relationship between brain regions and neurodegenerative disease, we attempt to identify the other top brain regions that are most correlated with other brain regions under the assumption that disease-related ROIs affect each other. We use the weighting matrix W¯ to calculate the Pearson correlation coefficient to represent the correlation among

Conclusions

In this paper, we introduce a multi-template adaptive sparse learning along with a multi-class classification model for neurodegenerative disease diagnosis. We use multiple brain parcellation atlases with different sets of regions of interest to fuse different features together. Specifically, we integrate the feature selection and subspace learning with a p-norm regularization. In the constructed subspace, we jointly consider the global and local information in the data space. To further

Declaration of Competing Interest

There are no conflicts of interest.

Acknowledgements

This work was supported partly by National Natural Science Foundation of China (Nos. 61871274, 61801305, 61806071 and 81571758), the Integration Project of Production Teaching and Research by Guangdong Province and Ministry of Education (No. 2012B091100495), Key Laboratory of Medical Image Processing of Guangdong Province (No. K217300003), Guangdong Province Key Laboratory of Popular High Performance Computers (No. 2017B03031407), Guangdong Pearl River Talents Plan (2016ZT06S220), Shenzhen

References (52)

  • N. Tzouriomazoyer et al. Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. Neuroimage (2002)
  • D. Zhang et al. Multi-modal multi-task learning for joint prediction of multiple regression and classification variables in Alzheimer’s disease. Neuroimage (2012)
  • D. Zhang et al. Multimodal classification of Alzheimer’s disease and mild cognitive impairment. Neuroimage (2011)
  • M.B. Aerts et al. Improving the Diagnostic Accuracy in Parkinsonism: A Three-Pronged Approach. Practical Neurology (2012)
  • Alzheimer's Association. Alzheimer's disease facts and figures. Alzheimer's & Dementia: The Journal of the Alzheimer's Association (2015)
  • M. Belkin and P. Niyogi. Laplacian Eigenmaps for dimensionality reduction and data representation: MIT... (2003)
  • K. Blennow et al. Cerebrospinal fluid and plasma biomarkers in Alzheimer disease. Nat. Rev. Neurol. (2010)
  • X. Chen et al. Accelerated gradient method for multi-task sparse learning problem
  • R.C. Craddock et al. A whole brain FMRI atlas generated via spatially constrained spectral clustering. Hum. Brain Mapp. (2012)
  • S. Emrani et al. Prognosis and diagnosis of Parkinson's disease using multi-task learning
  • P. Eusebi et al. Cerebrospinal fluid biomarkers for the diagnosis and prognosis of Parkinson's disease: protocol for a systematic review and individual participant data meta-analysis. BMJ Open (2017)
  • K.J. Friston. Statistical Parametric Mapping. In Neuroscience Databases (2003)
  • G. Fung et al. SVM feature selection for classification of SPECT images of Alzheimer’s disease using spatial information. Knowl. Inf. Syst. (2007)
  • Y. Grandvalet. Adaptive scaling for feature selection in SVMs. Paper presented at the
  • S. Hall et al. CSF biomarkers and clinical progression of Parkinson’s disease. Neurology (2015)
  • C. Jia et al. Low-rank tensor subspace learning for RGB-D action recognition. IEEE Trans. Image Process. (2016)