DOI: 10.1145/2487575.2487671

FeaFiner: biomarker identification from medical data through feature generalization and selection

Published: 11 August 2013

Abstract

Traditionally, feature construction and feature selection are two important but separate processes in data mining. However, many real-world applications require an integrated approach for creating, refining and selecting features. To address this problem, we propose FeaFiner (short for Feature Refiner), an efficient formulation that simultaneously generalizes low-level features into higher-level concepts and then selects relevant concepts based on the target variable. Specifically, we formulate a double sparsity optimization problem that identifies groups in the low-level features, generalizes higher-level features using the groups, and performs feature selection. Since in many clinical studies non-overlapping groups are preferred for better interpretability, we further improve the formulation to generalize features using mutually exclusive feature groups. The proposed formulation is challenging to solve due to the orthogonality constraints, the non-convex objective and the non-smooth penalties. We apply a recently developed augmented Lagrangian method to solve this formulation, in which each subproblem is solved by a non-monotone spectral projected gradient method. Our numerical experiments show that this approach is computationally efficient and also capable of producing solutions of high quality. We also present a generalization bound showing the consistency and the asymptotic behavior of the learning process of our proposed formulation.
Finally, the proposed FeaFiner method is validated on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset, where low-level biomarkers are automatically generalized into robust higher-level concepts, which are then selected for predicting the disease status measured by the Mini-Mental State Examination and the Alzheimer's Disease Assessment Scale cognitive subscore. Compared to existing predictive modeling methods, FeaFiner provides intuitive and robust feature concepts and competitive predictive accuracy.
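To make the optimization machinery named above concrete, the sketch below shows a generic non-monotone spectral projected gradient iteration in the style of Birgin, Martinez and Raydan, which is the kind of subproblem solver the abstract refers to. It is not the authors' implementation: the objective f, its gradient grad_f, the projection project onto the constraint set, and all parameter defaults are illustrative placeholders supplied by the caller.

    import numpy as np

    def nonmonotone_spg(f, grad_f, project, x0, max_iter=200, memory=10,
                        alpha_min=1e-10, alpha_max=1e10, sigma=1e-4, tol=1e-8):
        """Minimize f(x) subject to x in C, where `project` maps a point onto C."""
        x = project(np.asarray(x0, dtype=float))
        g = grad_f(x)
        alpha = 1.0                            # spectral (Barzilai-Borwein) step length
        f_hist = [f(x)]                        # recent objective values for the non-monotone test
        for _ in range(max_iter):
            d = project(x - alpha * g) - x     # projected gradient direction
            if np.linalg.norm(d) <= tol:
                break                          # approximately stationary on the constraint set
            f_ref = max(f_hist[-memory:])      # non-monotone reference value
            lam, gtd = 1.0, float(g @ d)
            # Non-monotone Armijo backtracking along the projected direction
            while lam > 1e-12 and f(x + lam * d) > f_ref + sigma * lam * gtd:
                lam *= 0.5
            x_new = x + lam * d
            g_new = grad_f(x_new)
            s, y = x_new - x, g_new - g
            sy = float(s @ y)
            # Barzilai-Borwein spectral step length for the next iteration
            alpha = float(np.clip((s @ s) / sy, alpha_min, alpha_max)) if sy > 0 else alpha_max
            x, g = x_new, g_new
            f_hist.append(f(x))
        return x

In FeaFiner's setting, project would enforce the constraints on the grouping variables (e.g., the orthogonality or mutual-exclusivity structure), while the augmented Lagrangian outer loop moves the remaining constraints into the objective; both of those pieces are specific to the paper and are not reproduced here.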





    Published In

    KDD '13: Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
    August 2013
    1534 pages
    ISBN:9781450321747
    DOI:10.1145/2487575
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. augmented lagrangian
    2. biomarkers
    3. feature generalization
    4. feature selection
    5. sparse learning
    6. spectral gradient descent

    Qualifiers

    • Poster

    Conference

    KDD '13

    Acceptance Rates

    KDD '13 Paper Acceptance Rate: 125 of 726 submissions, 17%
    Overall Acceptance Rate: 1,133 of 8,635 submissions, 13%


    Cited By

    • Genetic Programming with Multi-Task Feature Selection for Alzheimer's Disease Diagnosis. 2024 IEEE Congress on Evolutionary Computation (CEC), pp. 1-8, 2024. DOI: 10.1109/CEC60901.2024.10611973
    • Sparse Variable Selection on High Dimensional Heterogeneous Data With Tree Structured Responses. IEEE Access, 12:50779-50791, 2024. DOI: 10.1109/ACCESS.2024.3384309
    • Estimating Time to Progression of Chronic Obstructive Pulmonary Disease With Tolerance. IEEE Journal of Biomedical and Health Informatics, 25(1):175-180, 2021. DOI: 10.1109/JBHI.2020.2992259
    • Feature Selection Algorithms in Medical Data Classification: A Brief Survey and Experimentation. ICDSMLA 2019, pp. 831-841, 2020. DOI: 10.1007/978-981-15-1420-3_90
    • TITAN. Proceedings of the 27th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems, pp. 329-338, 2019. DOI: 10.1145/3347146.3359381
    • Data Subset Selection With Imperfect Multiple Labels. IEEE Transactions on Neural Networks and Learning Systems, 30(7):2212-2221, 2019. DOI: 10.1109/TNNLS.2018.2875470
    • ACM Notice of Article Removal: Deep Learning Based Medical Diagnosis System Using Multiple Data Sources (originally published in the ACM Digital Library on 29-Aug-2018). Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, pp. 699-706, 2018. DOI: 10.1145/3233547.3233730
    • Framework for integration of domain knowledge into logistic regression. Proceedings of the 8th International Conference on Web Intelligence, Mining and Semantics, pp. 1-8, 2018. DOI: 10.1145/3227609.3227653
    • Distributed Data Vending on Blockchain. 2018 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData), pp. 1100-1107, 2018. DOI: 10.1109/Cybermatics_2018.2018.00201
    • Patient Subtyping via Time-Aware LSTM Networks. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 65-74, 2017. DOI: 10.1145/3097983.3097997
