Multiple instance learning for classification of dementia in brain MRI

https://doi.org/10.1016/j.media.2014.04.006

Highlights

  • Multiple instance learning technique is applied to classification of subjects with Alzheimer’s disease.

  • Graphs are built from images to exploit information about the inherent structure of images for classification.

  • Validation is carried out on different classification tasks, including CN versus AD and SMCI versus PMCI.

  • Comparisons with two state-of-the-art methods show the effectiveness of the proposed method.

  • The proposed method provides an alternative framework for the detection and prediction of neurodegenerative diseases.

Abstract

Machine learning techniques have been widely used to detect morphological abnormalities from structural brain magnetic resonance imaging data and to support the diagnosis of neurological diseases such as dementia. In this paper, we propose to use a multiple instance learning (MIL) method in an application for the detection of Alzheimer’s disease (AD) and its prodromal stage, mild cognitive impairment (MCI). In our work, local intensity patches are extracted as features. However, not all the patches extracted from patients with dementia are equally affected by the disease and some of them may not be characteristic of morphology associated with the disease. Therefore, there is some ambiguity in assigning disease labels to these patches. The problem of the ambiguous training labels can be addressed by weakly supervised learning techniques such as MIL. A graph is built for each image to exploit the relationships among the patches and then to solve the MIL problem. The constructed graphs contain information about the appearances of patches and the relationships among them, which can reflect the inherent structures of images and aid the classification. Using the baseline MR images of 834 subjects from the ADNI study, the proposed method can achieve a classification accuracy of 89% between AD patients and healthy controls, and 70% between patients defined as stable MCI and progressive MCI in a leave-one-out cross validation. Compared with two state-of-the-art methods using the same dataset, the proposed method can achieve similar or improved results, providing an alternative framework for the detection and prediction of neurodegenerative diseases.

Introduction

Alzheimer’s disease (AD) is the aetiology most commonly responsible for clinical dementia worldwide. Its progression leads to a gradual decline of memory and cognitive functions. The prevalence of AD is predicted to quadruple in the next four decades (Brookmeyer et al., 2007). However, no drug or treatment has so far been reported to be able to stop the progress of AD and it remains difficult to predict whether individuals will develop AD. There is a critical need to develop biomarkers for the early diagnosis of AD and for measuring the outcomes of clinical drug trials (Clark et al., 2007). Although there is currently no cure for AD, there are some medications that can delay the onset of some symptoms such as memory loss, confusion, and cognitive problems (Yiannopoulou and Papageorgiou, 2013). Diagnosing AD early would allow doctors to treat patients sooner, which can then limit the devastating physical and psychological impact on patients and their relatives and reduce the economic burden on society. Mild cognitive impairment (MCI) is an intermediate stage between normal cognition and clinical dementia. Individuals with MCI have been reported to progress to clinical dementia at a rate of 10–15% annually (Grundman et al., 2004). Research on identifying MCI individuals who will progress to clinical dementia has received increasing attention in recent years (Wolz et al., 2011, Coupé et al., 2012, Wee et al., 2012b, Liu et al., 2013, Gray et al., 2013).

Different imaging techniques, such as structural magnetic resonance imaging (MRI) (Wolz et al., 2011, Coupé et al., 2012, Liu et al., 2013), functional MRI (Pihlajamäki and Sperling, 2008, Wee et al., 2011), fluorodeoxyglucose positron emission tomography (FDG-PET) (Herholz et al., 2002, Gray et al., 2012) and diffusion tensor imaging (DTI) (Wee et al., 2012b, Keihaninejad et al., 2013), have been used to derive image-based biomarkers for AD. Studies have shown that the combination of biomarkers from different imaging modalities (MRI, FDG-PET, DTI, fMRI) can provide complementary information about AD pathology and thus improve the classification accuracy (Zhang et al., 2011, Hinrichs et al., 2011, Wee et al., 2012b, Gray et al., 2013). In comparison to DTI, fMRI or FDG-PET, structural MRI is the most standardized and the most widely available imaging modality in clinical practice. In addition, MRI examinations can provide an opportunity to track different clinical phases of AD (Jack et al., 2013). Therefore, we evaluated our method using structural MR images. However, multiple datasets could also be acquired from different imaging modalities for developing different biomarkers of AD.

Several types of features can be derived from structural MRI for classification, such as gray matter density maps (Cuingnet et al., 2011, Liu et al., 2012a), cortical thickness (Cho et al., 2012, Wee et al., 2012a, Eskildsen et al., 2013) as well as volume and shape measures (Gerardin et al., 2009, Wolz et al., 2010). The number of training images is typically small in comparison with the high dimensionality of the voxel-wise features. Therefore, a feature selection step is necessary to tackle the problem of overfitting. Feature selection has been shown to improve the classification accuracy, but the improvement depends on the approach adopted (Chu et al., 2011). To reduce the feature space and select the discriminative features, statistical approaches (Yoon et al., 2007, Chu et al., 2011, Wee et al., 2012a) or sparse regression methods (Ghosh and Chinnaiyan, 2005, Liu et al., 2012b) are often used. Another popular method is to segment the whole brain into multiple anatomical (Gray et al., 2012) or discriminative (Fan et al., 2007) regions and then extract regional features such as volume or shape measures for classification. It should be noted that the features extracted from neuroimaging data are not isolated and exhibit high correlations (Chu et al., 2011). To account for the relationships among these features, tree-guided sparse coding methods (Liu et al., 2012b) and re-sampling schemes using Elastic Net (Janousova et al., 2012) have recently been proposed. These approaches can select voxel-wise features in meaningful brain regions, which may be related to pathology.

The features derived from MRI can be extracted from very local regions or the whole brain. At the voxel level, intensities or gray matter densities can be directly used in classification (Cuingnet et al., 2011, Vounou et al., 2012). At the whole image level, similarities between images can be used to derive features (Wolz et al., 2012). However, the structural changes induced in the early stages of AD have been observed to occur in small local regions rather than isolated voxels or the whole brain (Hinrichs et al., 2009). Patches represent features at an intermediate scale between the voxel level and the image level, which can capture disease-induced changes in local regions. Recent approaches (Coupé et al., 2012, Liu et al., 2013) utilize local intensity patterns within patches to capture the local structural information for AD classification. In these approaches, patches from patients with AD are used as positive samples and patches from healthy subjects are regarded as negative samples for training. However, patches are relatively small regions in brain images and not all patches in the brain are characteristic of changes associated with pathology. For example, patches in close vicinity of the hippocampus are more likely to be affected by AD while patches in homogeneous regions may not be affected. This is illustrated in Fig. 1. In addition, different types of dementias have different aetiologies. This means that some patches may be affected by other aetiologies such as cerebrovascular disease rather than AD. Therefore, not all patches from patients necessarily represent positive training samples. This means that there is some ambiguity in assigning disease labels to the training patches extracted from patients. One solution to this problem is to use a weakly supervised method such as multiple instance learning (MIL) (Maron and Lozano-Pérez, 1998), which can learn classifiers from ambiguously labeled training data. 
Although MIL has been successfully applied to different applications in computer vision (Babenko et al., 2009) and recently in medical imaging (Bi and Liang, 2007, Xu et al., 2012), to the best of our knowledge, it has not been used in the context of classification of neurological diseases. In this paper, we propose to use MIL for the classification of AD and to address the problem of ambiguous patch labels. Specifically, each image is regarded as a bag; the patches extracted from the images are thus treated as inter-correlated instances in the bags. MIL is then used to learn a bag-level classifier to predict the bag labels of unseen images and therefore classify the subjects.
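The bag-and-instance view described above can be illustrated with a small sketch. It is not the authors' implementation: the `Bag` class, the `extract_patches` and `make_bag` helpers, the dense sliding-window sampling and the 2D toy image are all assumptions made for illustration (the study works on 3D MR volumes, and this excerpt does not say how patch locations are chosen). The point it shows is that only the bag carries a label, while individual patches remain unlabeled.

```python
from dataclasses import dataclass
from typing import List, Sequence

Patch = List[float]  # flattened intensity values of one patch


@dataclass
class Bag:
    """One image treated as a bag of patch instances with one subject-level label."""
    subject_id: str
    instances: List[Patch]
    label: int  # 1 = patient, 0 = control; patches themselves carry no label


def extract_patches(image: Sequence[Sequence[float]], size: int, stride: int) -> List[Patch]:
    """Slide a size x size window over a 2D image and flatten each window."""
    rows, cols = len(image), len(image[0])
    patches = []
    for r in range(0, rows - size + 1, stride):
        for c in range(0, cols - size + 1, stride):
            patches.append([image[r + i][c + j] for i in range(size) for j in range(size)])
    return patches


def make_bag(subject_id: str, image, label: int, size: int = 2, stride: int = 2) -> Bag:
    """Hypothetical helper: turn one labeled image into one labeled bag."""
    return Bag(subject_id, extract_patches(image, size, stride), label)


# Toy 4 x 4 "image" standing in for an MR scan.
toy_image = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7, 0.8],
    [0.9, 1.0, 1.1, 1.2],
    [1.3, 1.4, 1.5, 1.6],
]
bag = make_bag("subject_001", toy_image, label=1)
```

With a window of size 2 and stride 2, the toy image yields four non-overlapping patches, all of which inherit nothing but membership in a bag labeled 1.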

Most existing approaches utilize the intensity values of patches for classification. The relationships among patches are usually ignored since the patches are treated as independently and identically distributed. However, patches from the same subject are rarely independent and often exhibit shared information. This information across patches can convey information about the inherent structure of the images, which may be helpful for disease classification. In recent works, correlated features are extracted to exploit the relationships among patches (Liu et al., 2013) or ROIs (Wee et al., 2012a) of the same subject, which has been shown to improve the classification accuracy. In our work, a graph is constructed from each image in order to investigate the relationships among patches and to exploit the inherent structural information of each image. After that, a graph kernel, which utilizes both the intensity values and the relationships of the extracted patches, is used to distinguish the positive and negative bags. Finally, a bag-level classifier is trained via a kernel machine.
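A minimal sketch of a graph kernel between two bags is given below. The excerpt names the method mi-Graph but does not spell out the kernel, so the form used here is an assumption: it follows the commonly described mi-Graph construction, in which an edge links two patches of the same bag when their Euclidean distance is below a threshold, each patch is weighted by the inverse of its number of neighbours (itself included), and the bag kernel is a normalized weighted sum of Gaussian kernels between patches. The function names and the `gamma` and `threshold` parameters are hypothetical.

```python
import math
from typing import List

Patch = List[float]


def sq_dist(x: Patch, y: Patch) -> float:
    """Squared Euclidean distance between two flattened patches."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def gaussian_kernel(x: Patch, y: Patch, gamma: float) -> float:
    """Instance-level kernel on patch intensities."""
    return math.exp(-gamma * sq_dist(x, y))


def node_weights(bag: List[Patch], threshold: float) -> List[float]:
    """Build a thresholded graph over the patches and return 1/degree per node."""
    if threshold <= 0:
        raise ValueError("threshold must be positive so every node links to itself")
    weights = []
    for x in bag:
        # A patch is linked to every patch closer than the threshold, itself included.
        degree = sum(1 for y in bag if math.sqrt(sq_dist(x, y)) < threshold)
        weights.append(1.0 / degree)
    return weights


def graph_kernel(bag_a: List[Patch], bag_b: List[Patch],
                 gamma: float = 1.0, threshold: float = 1.0) -> float:
    """Bag-level kernel: weighted, normalized sum of patch-to-patch kernels."""
    wa = node_weights(bag_a, threshold)
    wb = node_weights(bag_b, threshold)
    numerator = sum(wi * wj * gaussian_kernel(x, y, gamma)
                    for x, wi in zip(bag_a, wa)
                    for y, wj in zip(bag_b, wb))
    return numerator / (sum(wa) * sum(wb))


bag_same = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
bag_other = [[0.0, 0.0], [3.0, 3.0]]
k_self = graph_kernel(bag_same, bag_same)
k_ab = graph_kernel(bag_same, bag_other)
k_ba = graph_kernel(bag_other, bag_same)
```

Under this weighting, patches in a tightly connected cluster share their influence instead of being counted many times. The matrix of such kernel values over all training bags could then be passed to a kernel machine to train the bag-level classifier.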

A preliminary version of the presented framework has been published as a conference paper (Tong et al., 2013). The major difference in this work is that we adopted a more robust feature selection method as proposed in Janousova et al. (2012). In addition, an extended evaluation on the whole brain is presented and more detailed comparisons with state-of-the-art methods are also provided. The remainder of this paper is organized as follows: The demographic information of the image dataset used in the preparation of this article is introduced in Section 2.1. This is followed by a description of the preprocessing pipeline of these images in Section 2.3 and a description of how patches are extracted from the images to form corresponding bags in Section 2.4. We will then introduce the methodology of MIL and how we apply it to the classification of AD in Section 2.5. Performance of the proposed method has been evaluated using 834 subjects from the ADNI study. In Section 3, the influence of different parameters is studied and the performance of the proposed method is also compared with state-of-the-art techniques. The strengths and weaknesses of the proposed method are analyzed in the discussion section and finally we conclude the paper in Section 5.

Section snippets

Subjects

Data used in the preparation of this article were obtained from the ADNI database (adni.loni.ucla.edu). The ADNI was launched in 2003 by the National Institute on Aging (NIA), the National Institute of Biomedical Imaging and Bioengineering (NIBIB), the Food and Drug Administration (FDA), private pharmaceutical companies and non-profit organizations, as a $60 million, 5-year public–private partnership. The primary goal of ADNI has been to test whether serial MRI, PET, other biological markers,

Experiments and results

The performance of the proposed mi-Graph was evaluated on different classification tasks, including CN vs AD, CN vs PMCI and SMCI vs PMCI. Experiments were performed using leave-one-out cross validation since this validation is known to be an almost unbiased estimator (Cawley and Talbot, 2004). For a fair comparison with the study in Wolz et al. (2011), we also utilized a leave 5% out cross validation as adopted in their work. There are five important parameters in our proposed method: the size
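The leave-one-out protocol mentioned above can be sketched as follows. This is an illustration only: the `leave_one_out_accuracy` function, the one-dimensional toy data and the nearest-neighbour stand-in classifier are assumptions and have nothing to do with the classifier or features used in the study. Each subject is held out once, the classifier is fitted on the remaining subjects, and the held-out prediction is scored.

```python
from typing import Callable, List


def leave_one_out_accuracy(samples: List[float], labels: List[int],
                           fit_predict: Callable[[List[float], List[int], float], int]) -> float:
    """Hold out each sample once, train on the rest, and average the hits."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if fit_predict(train_x, train_y, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)


def nearest_neighbour(train_x: List[float], train_y: List[int], test_x: float) -> int:
    """Hypothetical stand-in classifier: label of the closest training sample."""
    best = min(range(len(train_x)), key=lambda j: abs(train_x[j] - test_x))
    return train_y[best]


# Toy one-dimensional features for six "subjects" in two well-separated groups.
toy_x = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
toy_y = [0, 0, 0, 1, 1, 1]
accuracy = leave_one_out_accuracy(toy_x, toy_y, nearest_neighbour)
```

A leave 5% out variant would hold out a block of about 5% of the subjects at each round instead of a single subject, with the rest of the loop unchanged.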

Discussion

In this paper, we have developed a patch-based approach for the classification of subjects with diseases such as AD. Since patches that are extracted from images of patients with AD may not be affected by AD, or may be affected by other types of disease (e.g. cerebrovascular disease), there is some ambiguity in assigning disease labels to these patches. We proposed to use MIL to address the problem of ambiguous labels of the training patches. The intensities of the patches and the relationships among

Conclusion

In this study, we have shown that the multiple instance learning technique can be successfully applied to the classification of AD. The proposed method was evaluated on a large database using all 834 baseline MR scans in the ADNI study. The direct comparisons with two recent methods demonstrate the effectiveness of the proposed method. In future work, we plan to extend the proposed framework using longitudinal datasets and other imaging modalities, such as FDG-PET images.

Acknowledgments

This project was partially funded by the China Scholarship Council. The ADNI Data collection and sharing for this project was funded by the Alzheimer’s Disease Neuroimaging Initiative (ADNI; Principal Investigator: Michael Weiner; NIH Grant U01 AG024904). ADNI is funded by the National Institute on Aging, the National Institute of Biomedical Imaging and Bioengineering (NIBIB), and through generous contributions from the following: Pfizer Inc., Wyeth Research, Bristol-Myers Squibb, Eli Lilly and

References (63)

  • C. Hinrichs et al. Spatially augmented lpboosting for AD classification with evaluations on the ADNI dataset. NeuroImage (2009)
  • C. Hinrichs et al. Predictive markers for AD in a multi-modality framework: an analysis of MCI progression in the ADNI population. NeuroImage (2011)
  • S. Keihaninejad et al. An unbiased longitudinal analysis framework for tracking white matter changes using diffusion tensor imaging with application to Alzheimer’s disease. NeuroImage (2013)
  • J. Koikkalainen et al. Multi-template tensor-based morphometry: application to analysis of Alzheimer’s disease. NeuroImage (2011)
  • J.P. Lerch et al. Cortical thickness analysis examined through power analysis and a population simulation. NeuroImage (2005)
  • K.K. Leung et al. Brain MAPS: an automated, accurate and robust brain extraction technique using a template library. NeuroImage (2011)
  • M. Liu et al. Ensemble sparse classification of Alzheimer’s disease. NeuroImage (2012)
  • J. Lötjönen et al. Fast and robust extraction of hippocampus from MR images for diagnostics of Alzheimer’s disease. NeuroImage (2011)
  • N.A. Ranginwala et al. Clinical criteria for the diagnosis of Alzheimer disease: still good after all these years. Am. J. Geriatric Psych (2008)
  • M. Vounou et al. Sparse reduced-rank regression detects genetic associations with voxel-wise longitudinal phenotypes in Alzheimer’s disease. NeuroImage (2012)
  • C.-Y. Wee et al. Enriched white matter connectivity networks for accurate identification of MCI patients. NeuroImage (2011)
  • C.-Y. Wee et al. Identification of MCI individuals using structural and functional connectivity networks. NeuroImage (2012)
  • R. Wolz et al. Measurement of hippocampal atrophy using 4D graph-cut segmentation: application to ADNI. NeuroImage (2010)
  • R. Wolz et al. Nonlinear dimensionality reduction combining MR imaging with non-imaging information. Medical Image Anal. (2012)
  • B.T. Wyman et al. Standardization of analysis sets for reporting results from ADNI MRI data. Alzheimer’s Dementia (2013)
  • U. Yoon et al. Pattern classification using principal components of cortical thickness and its discriminative pattern in schizophrenia. NeuroImage (2007)
  • D. Zhang et al. Multimodal classification of Alzheimer’s disease and mild cognitive impairment. NeuroImage (2011)
  • S. Andrews et al. Support vector machines for multiple-instance learning. Adv. Neural Inf. Process. Syst. (2002)
  • Babenko, B., Yang, M.-H., Belongie, S., 2009. Visual tracking with online multiple instance learning. In: IEEE...
  • Bi, J., Liang, J., 2007. Multiple instance learning of pulmonary embolism detection with geodesic distance along...
  • G.C. Cawley et al. Fast exact leave-one-out cross-validation of sparse least-squares support vector machines. Neural Networks (2004)

1 Data used in the preparation of this article were obtained from the ADNI database (http://www.loni.ucla.edu/ADNI). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: www.loni.ucla.edu/ADNI/Collaboration/ADNI_Authorship_list.pdf.