Abstract
One of the goals of medical research on dementia is to correlate images of the brain with clinical tests. Our approach is to start with the images and explain their differences and commonalities in terms of the other variables. First, we cluster positron emission tomography (PET) scans of patients to form groups that share similar features in brain metabolism. To the best of our knowledge, this is the first time that clustering has been applied to whole PET scans. Second, we explain the clusters by relating them to non-image variables. To do so, we employ RSD, an algorithm for relational subgroup discovery, with the cluster membership of patients as the target variable. Our results enable interesting interpretations of differences in brain metabolism in terms of demographic and clinical variables. The approach was implemented and tested on an exceptionally large data collection of patients with different types of dementia. It comprises 10 GB of image data from 454 PET scans, and 42 variables from psychological and demographic data organized in 11 relations of a relational database. We believe that explaining medical images in terms of other variables (patient records, demographic information, etc.) is a new, challenging, and rewarding area for data mining research.
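The second step, explaining cluster membership through non-image variables, can be illustrated with a minimal sketch. It is not RSD itself, which constructs first-order features from relational data. It only scores hand-written candidate descriptions with weighted relative accuracy (WRAcc), a quality measure commonly used in subgroup discovery. The patient records, the variable names `age` and `mmse`, the cluster labels, and the candidate descriptions are all hypothetical and assume the clustering step has already been carried out.

```python
def wracc(covered, labels, target):
    """Weighted relative accuracy of one subgroup for one target cluster.

    covered: list of bools, True where a patient matches the description
    labels:  list of cluster ids, one per patient
    target:  cluster id the subgroup is meant to explain
    WRAcc = p(covered) * (p(target | covered) - p(target))
    """
    n = len(labels)
    n_cov = sum(covered)
    if n == 0 or n_cov == 0:
        return 0.0
    p_target = sum(1 for l in labels if l == target) / n
    hits = sum(1 for c, l in zip(covered, labels) if c and l == target)
    return (n_cov / n) * (hits / n_cov - p_target)


# Hypothetical toy records; "cluster" stands in for the result of image clustering.
patients = [
    {"age": 58, "mmse": 28, "cluster": 0},
    {"age": 63, "mmse": 26, "cluster": 0},
    {"age": 71, "mmse": 25, "cluster": 0},
    {"age": 66, "mmse": 27, "cluster": 0},
    {"age": 74, "mmse": 17, "cluster": 1},
    {"age": 79, "mmse": 15, "cluster": 1},
    {"age": 69, "mmse": 19, "cluster": 1},
    {"age": 81, "mmse": 24, "cluster": 0},
]

# Hypothetical candidate subgroup descriptions over non-image variables.
candidates = {
    "mmse < 20": lambda p: p["mmse"] < 20,
    "age >= 70": lambda p: p["age"] >= 70,
}

labels = [p["cluster"] for p in patients]
scores = {
    name: wracc([test(p) for p in patients], labels, target=1)
    for name, test in candidates.items()
}
# The description with the highest WRAcc is the best explanation of cluster 1.
best = max(scores, key=scores.get)
```

On this toy data the description `mmse < 20` covers exactly the patients in cluster 1, so it scores higher than `age >= 70`, which covers a mixed group.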
Cite this article
Schmidt, J., Hapfelmeier, A., Mueller, M. et al. Interpreting PET scans by structured patient data: a data mining case study in dementia research. Knowl Inf Syst 24, 149–170 (2010). https://doi.org/10.1007/s10115-009-0234-y
DOI: https://doi.org/10.1007/s10115-009-0234-y