NeuroImage

Volume 152, 15 May 2017, Pages 299-311
Voxel-based logistic analysis of PPMI control and Parkinson's disease DaTscans

https://doi.org/10.1016/j.neuroimage.2017.02.067

Highlights

  • Voxel-wise analysis of PPMI Control and Parkinson's DaTscan images gives accurate classification and identifies voxels that are informative.

  • New analysis called Logistic Principal Components reveals sources of variation in control and PD images that affect classification.

  • Logistic features are related to MDS-UPDRS scores.

  • Logistic features interact with sex and age, but not with handedness.

Abstract

A comprehensive analysis of the Parkinson's Progression Markers Initiative (PPMI) Dopamine Transporter Single Photon Emission Computed Tomography (DaTscan) images is carried out using a voxel-based logistic lasso model. The model reveals that sub-regional voxels in the caudate, the putamen, as well as in the globus pallidus are informative for classifying images into control and PD classes. Further, a new technique called logistic component analysis is developed. This technique reveals that intra-population differences in dopamine transporter concentration and imperfect normalization are significant factors influencing logistic analysis. The interactions with handedness, sex, and age are also evaluated.

Introduction

Dopamine transporter imaging by [123I]-FP-CIT SPECT (also known as DaTscan) is used to diagnose Parkinson's disease (PD) and to distinguish it from other movement disorders, such as essential tremor (Benamer et al., 2000). In the clinic, DaTscans are usually interpreted visually by experts, but automated quantitative analysis is likely to improve the interpretation. The European Association of Nuclear Medicine Neuroimaging Committee recommends quantitative analysis in addition to visual analysis (Darcourt et al., 2010). Because PD primarily affects dopaminergic neurons, most previous quantitative analyses of DaTscans have focused on the striatum. We too focus on the striatum, but also include the globus pallidus and the thalamus in the analysis. The motivation for including these extra-striatal structures is discussed below in detail.

The Parkinson's Progression Markers Initiative (PPMI) is a study that offers an unprecedented number of DaTscans for analysis (www.ppmi.org). As of April 2016, over 600 subjects (control+PD) have been scanned, and after reconstruction and registration to a common space, their images are available for download and analysis. This large amount of data opens the door to using machine learning techniques to classify control and PD images.

Much of the previous work on machine learning/automated analysis of DaTscans is region-based (e.g. Prashanth et al., 2014, Zubal et al., 2007). The regional striatal binding ratios are calculated for the right and left putamen and the right and left caudate, and these four numbers are used in all subsequent analysis. Region-based analysis is usually justified on the grounds that there is dopamine loss in the putamen relative to the caudate in PD. However, from a statistical point of view, there are limitations to region-based analysis. First, it is unclear whether a single number calculated from a region is statistically optimal for analysis or for classification. Certain voxels within the region may have higher dopamine transporter loss than others and hence may be more informative. Second, it is not clear why only the putamen and the caudate should enter into quantification. Extra-striatal regions can contain a significant amount of dopamine, and using these regions can improve the statistical reliability of the method. This could happen, for example, if the extra-striatal regions are pooled with the caudate to provide a reference against which the putamen is compared. Pooling data improves statistical reliability. Dopaminergic neurons are known to occur densely in the primate thalamus (Sánchez-González et al., 2005), and the thalamus was added to our analysis for that reason. Extra-striatal structures are also known to be involved in PD. The globus pallidus is known to be involved in PD subtypes (Rajput et al., 2008), and was included in our analysis for that reason. Including all voxels from these regions in the data and letting an algorithm decide which voxels are most informative is likely to be statistically more meaningful than assuming a priori which voxels are important. Such an algorithm is voxel-based rather than region-based.

We use the logistic lasso (Tibshirani, 1996) as the machine learning method for classification. The logistic lasso is voxel-based. It works by using a linear combination of a sparse set of voxels to calculate the probabilities of belonging to the control and PD classes. Voxels in the sparse set are chosen purely based on training data. Below, we use the informal term “informative voxels” to denote those voxels that are statistically useful for classification. Informative voxels are likely to be a subset of all voxels that are analyzed, and possibly also a subset of all voxels affected by PD.
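The idea of a sparse, voxel-based logistic classifier can be illustrated with a minimal sketch. This is not the authors' implementation: it uses plain proximal gradient descent instead of the ADMM solver mentioned later in the paper, it runs on synthetic data rather than DaTscans, and the function names, the 0/1 label coding, the step size, and the value of the penalty `lam` are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    # Proximal operator of t * |v|; it sets small coefficients exactly to zero.
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_logistic_lasso(X, y, lam, step=0.1, n_iter=500):
    """Minimise mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    X: list of feature rows (one row per image, one column per voxel).
    y: list of 0/1 labels (assumed coding: 0 = control, 1 = PD).
    The intercept b is not penalised.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            grad_b += err / n
            for j in range(d):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        # Gradient step on the smooth loss, then soft-thresholding for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam) for wj, gj in zip(w, grad_w)]
    return w, b


def predict_proba(X, w, b):
    # Estimated probability of the PD class for each row.
    return [sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) for xi in X]


# Synthetic toy data: 5 "voxels", of which only voxel 0 carries class information.
random.seed(0)
X, y = [], []
for i in range(200):
    label = i % 2
    row = [random.gauss(0.0, 1.0) for _ in range(5)]
    row[0] += -1.5 if label == 1 else 1.5  # lower signal in the simulated PD class
    X.append(row)
    y.append(label)

w, b = fit_logistic_lasso(X, y, lam=0.1)
informative = [j for j, wj in enumerate(w) if wj != 0.0]  # voxels kept by the lasso
probs = predict_proba(X, w, b)
accuracy = sum((p > 0.5) == (yi == 1) for p, yi in zip(probs, y)) / len(y)
```

In this sketch the "informative voxels" are simply the indices with non-zero coefficients, and the classifier returns a probability for each image rather than a hard label. On real images the penalty would be chosen by cross-validation rather than fixed by hand.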

Understanding the heterogeneity of the PPMI data set is also important because heterogeneity in the DaT signal can be confounding; this can happen, for example, in a clinical trial where response to dopaminergic therapies is measured. We need a greater understanding of how the image features that distinguish controls from PDs vary within each population. To achieve this, we introduce the concept of logistic principal components (LPC). LPCs are particularly illuminating for the PPMI data, as we show in the Results section. We also investigate the interaction of the discriminatory image feature with handedness, sex, and age and establish the significance of the interaction with p-values.

To our knowledge, such a comprehensive analysis of PPMI DaTscans at the voxel level has not yet been carried out.

Imaging holds considerable promise in evaluating pre-motor PD, assessing disease progression, and in differential diagnosis. Excellent reviews are available in Booij and Knol (2007) and Tatsch and Poepperl (2013). While our paper is focused on analyzing DaTscan images, techniques for analysis of other SPECT methods have also been developed; one example is the IBZM tool (Buchert et al., 2006).

Machine learning/automated classification of DaTscan images has been applied to non-PPMI data (Illan et al., 2012, Koch et al., 2005, Morton et al., 2005, Segovia et al., 2012, Tossici-Bolt et al., 2006, Towey et al., 2011) as well as to PPMI data (Kuo et al., 2013, Kuo et al., 2014, Oliviera and Castelo-Branco, 2015, Prashanth et al., 2014, Zubal et al., 2007). Pioneering studies of automated classification of PPMI images were carried out by Zubal, Kuo and co-workers (Kuo et al., 2013, Kuo et al., 2014, Zubal et al., 2007) starting in 2007. In their technique, each three-dimensional DaTscan is projected onto a two-dimensional plane by summing voxels along the vertical dimension. A rudimentary striatal “atlas” containing the caudate, the putamen, and the occipital lobe is placed and adjusted on the two-dimensional image. The mean striatal binding ratios in the left and right caudate nuclei and putamina are calculated. In Zubal et al. (2007), these ratios are compared with corresponding ratios calculated from manual tracings, validating the automated placement of the atlas. In Kuo et al. (2013), the smaller of the left and right striatal binding ratios is used in an ROC analysis for classification. In Kuo et al. (2014), the difference between left and right striatal binding ratios is used as a laterality measure and compared with clinical symptoms and visual reads. Similar atlas or template-based approaches using non-PPMI data are described in Koch et al. (2005), Morton et al. (2005), and Tossici-Bolt et al. (2006).
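The region-based quantities described above can be made concrete with a small sketch. It assumes the commonly used definition of the striatal binding ratio, (target mean - reference mean) / reference mean, with the occipital region as the reference. The function names, the toy volume, and the pixel masks are hypothetical, and the sketch does not reproduce the atlas placement or adjustment steps of the cited programs.

```python
def project_axial(volume):
    """Sum a 3D volume, indexed volume[z][y][x], along z to get a 2D image."""
    ny, nx = len(volume[0]), len(volume[0][0])
    image = [[0.0] * nx for _ in range(ny)]
    for plane in volume:
        for r in range(ny):
            for c in range(nx):
                image[r][c] += plane[r][c]
    return image


def region_mean(image, mask):
    """Mean of the image pixels at the (row, col) positions listed in mask."""
    return sum(image[r][c] for r, c in mask) / len(mask)


def striatal_binding_ratio(image, target_mask, reference_mask):
    """SBR = (target mean - reference mean) / reference mean."""
    ref = region_mean(image, reference_mask)
    return (region_mean(image, target_mask) - ref) / ref


# Toy volume: two identical slices of 2 rows x 3 columns.
# Column 0 stands in for the left putamen, column 1 for the right putamen,
# and column 2 for the occipital reference region (all hypothetical).
slice_ = [[3.0, 1.5, 1.0],
          [3.0, 1.5, 1.0]]
volume = [slice_, slice_]

left_mask = [(0, 0), (1, 0)]
right_mask = [(0, 1), (1, 1)]
occipital_mask = [(0, 2), (1, 2)]

image = project_axial(volume)
sbr_left = striatal_binding_ratio(image, left_mask, occipital_mask)
sbr_right = striatal_binding_ratio(image, right_mask, occipital_mask)

lower_sbr = min(sbr_left, sbr_right)  # the kind of summary used in an ROC analysis
laterality = sbr_left - sbr_right     # left-right difference as a laterality measure
```

Each region is reduced to a single number here, which is the limitation that motivates the voxel-based approach of this paper.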

Regional-level support-vector and logistic analysis of the PPMI images was carried out by Prashanth et al. (2014) using the mean striatal binding ratios in the left and right caudate and putamen. Interestingly, the authors find that an interaction term (a product of the binding ratios of the two caudate nuclei) is necessary for accurate logistic classification.

One exception to the region-based analysis is the voxel-based analysis of PPMI images carried out by Oliviera and Castelo-Branco (2015) using a support-vector machine. Our approach is similar in spirit to this approach, but differs from it in several important aspects. First, support-vector machines provide a binary output while the logistic model provides a probability of classification, which is more nuanced. Second, the support-vector machine used in Oliviera and Castelo-Branco (2015) cannot identify informative voxels, while the logistic model can. The authors of Oliviera and Castelo-Branco (2015) use a post-processing step based on a voxel-wise z-score to identify voxels in a PD image that differ from corresponding voxels in the control images. This identifies only the most strongly differing voxels; it does not identify all voxels that contribute to the classification. In contrast, the logistic lasso model explicitly identifies informative voxels, and only uses the identified voxels for classification. Third, an extension of the logistic formulation that we propose provides a mechanism (LPCs) to understand the source of variation in the data as it pertains to classification. No such formulation is available for support-vector machines. Finally, the logistic model provides a simple mechanism to understand interactions with age, gender, etc. These analyses give significant additional insight into the data.

Machine learning has also been applied successfully to classify controls from PD subjects using non-DaTscan information. An excellent survey of machine learning approaches for PD using voice recordings, MR images, gait patterns etc. can be found in Bind et al. (2015).

PPMI images

As of April 2016, DaTscans from 658 subjects (210 controls + 448 PD) were available from PPMI and were downloaded for this study. The PD subjects had multiple longitudinal scans, and only the first of these longitudinal scans was used. Controls do not have longitudinal scans; they are scanned only once.

The imaging protocol for the PPMI scans is documented in http://www.ppmi-info.org/wp-content/uploads/2013/02/PPMI-Protocol-AM5-Final-27Nov2012v6-2.pdf. All scans are co-registered and resampled

Applying the logistic lasso

The logistic lasso model was applied to the preprocessed PPMI DaTscan images in four different ways: First, all of the images were used as training images, with 10-fold cross-validation to determine the λ parameter. Then the ADMM algorithm was used to fit the logistic lasso to all images using the cross-validated λ. We refer to this as the all-data case. Next, the images were divided into three equal-sized groups, each group containing the same fraction of control and PD images as the original

Discussion

We now turn to discussing implications of the above results, starting with the logistic lasso results of Tables 2 and 3 and the visualization of logistic lasso coefficients in Fig. 5. But first, a word of caution. The logistic lasso model is focused on calculating classification probabilities. All of the above results, as well as the discussion below, should be interpreted only in the context of classification. The results and the discussion do not bear a more general

Conclusion

In conclusion, 3D voxel-wise logistic analysis of the PPMI control and PD population provides accurate classification. The analysis shows that sub-regional voxels in the caudate, the globus pallidus, and the putamen are informative for classification.

Logistic principal component analysis reveals two uncorrelated sources which explain most of the variance of the logistic feature. Finally, there are significant interactions of the logistic feature with sex (for controls) and with age, but not

Acknowledgements

Data used in the preparation of this article were obtained from the Parkinson's Progression Markers Initiative (PPMI) database (www.ppmi-info.org/data). For up-to-date information on the study, visit www.ppmi-info.org.

PPMI – a public-private partnership – is funded by the Michael J. Fox Foundation for Parkinson's Research and multiple funding partners. The full list of PPMI funding partners can be found at ppmi-info.org/fundingpartners.

We would like to add that the research presented in this

References (35)

  • C.J. Holmes et al.

    Enhancement of MR images using registration for signal averaging

    J. Comput. Assist. Tomogr.

    (1998)
  • ...
  • ...
  • I.A. Illan et al.

    Automatic assistance to Parkinson's disease diagnosis in DaTSCAN SPECT imaging

    Med. Phys.

    (2012)
  • R.B. Innis et al.

    Consensus nomenclature for in vivo imaging of reversibly binding radioligands

    J. Cereb. Blood Flow Metab.

    (2007)
  • W. Koch et al.

    Clinical testing of an optimized software solution for an automated, observer-independent evaluation of dopamine transporter SPECT studies

    J. Nucl. Med.

    (2005)
  • P.H. Kuo et al.

    Receiver-operating-characteristic analysis of an automated program for analyzing striatal uptake of 123I-Ioflupane SPECT images: calibration using visual reads

    J. Nucl. Med. Technol.

    (2013)