Abstract
Mining big data in brain imaging genetics is an emerging topic in brain science. It can uncover meaningful associations between genetic variations and brain structures and functions. Sparse canonical correlation analysis (SCCA) has been introduced to discover bi-multivariate correlations with feature selection. However, existing SCCA methods cannot be directly applied to big brain imaging genetics data due to two limitations. First, they have cubic complexity in the size of the matrix involved and become computationally and memory intensive when the matrix is large. Second, the parameters of an SCCA method need to be fine-tuned in advance. Tuning further increases the computational time dramatically, and the problem becomes severe in high-dimensional scenarios. In this paper, we propose two fast and efficient algorithms to speed up structure-aware SCCA (S2CCA) implementations without modifying the original SCCA models. The fast algorithms employ a divide-and-conquer strategy and are easy to implement. Experimental results show that the fast algorithms improve computational efficiency by tens to hundreds of times compared with conventional algorithms, while yielding similar correlation coefficients and canonical loading profiles to the conventional implementations. The fast algorithms can also be easily parallelized to further reduce the computational time. These results indicate that the proposed fast, scalable SCCA algorithms can be a powerful tool for big data analysis in brain imaging genetics.
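For orientation, the generic SCCA model that methods of this kind build on is commonly written in the form below. This is only a sketch in assumed notation, since the abstract itself introduces no symbols: X (n × p) and Y (n × q) are taken to be the column-centered genetic and imaging data matrices for n subjects, u and v the canonical weight vectors, Ω_1 and Ω_2 sparsity-inducing penalties, and c_1, c_2 the tuning parameters. The specific structure-aware penalties of S2CCA and the proposed divide-and-conquer scheme are not reproduced here.

```latex
\begin{equation*}
\begin{aligned}
\max_{\mathbf{u},\,\mathbf{v}} \quad & \mathbf{u}^{\top}\mathbf{X}^{\top}\mathbf{Y}\mathbf{v} \\
\text{s.t.} \quad & \|\mathbf{X}\mathbf{u}\|_2^2 \le 1, \qquad \|\mathbf{Y}\mathbf{v}\|_2^2 \le 1, \\
& \Omega_1(\mathbf{u}) \le c_1, \qquad \Omega_2(\mathbf{v}) \le c_2 .
\end{aligned}
\end{equation*}
```

In this notation, c_1 and c_2 play the role of the parameters that must be tuned in advance; if they are selected by a grid search (an assumption here, not stated in the abstract), the model has to be refit once per candidate pair, which is why tuning multiplies the overall running time.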
L. Du—This work was supported by NSFC (61602384), the Natural Science Basic Research Plan in Shaanxi Province of China (2017JQ6001), the China Postdoctoral Science Foundation (2017M613214), and the Fundamental Research Funds for the Central Universities (3102016OQD0065). This work was also supported by NIH R01 EB022574, R01 LM011360, U01 AG024904, P30 AG10133, R01 AG19771, UL1 TR001108, R01 AG 042437, R01 AG046171, and R01 AG040770, by DoD W81XWH-14-2-0151, W81XWH-13-1-0259, W81XWH-12-2-0012, and NCAA 14132004.
Data used in preparation of this article were obtained from the Alzheimer’s Disease Neuroimaging Initiative (ADNI) database (adni.loni.usc.edu). As such, the investigators within the ADNI contributed to the design and implementation of ADNI and/or provided data but did not participate in analysis or writing of this report. A complete listing of ADNI investigators can be found at: http://adni.loni.usc.edu/wp-content/uploads/how_to_apply/ADNI_Acknowledgement_List.pdf.
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Huang, Y. et al. (2017). A Fast SCCA Algorithm for Big Data Analysis in Brain Imaging Genetics. In: Cardoso, M., et al. (eds.) Graphs in Biomedical Image Analysis, Computational Anatomy and Imaging Genetics. GRAIL 2017, MICGen 2017, MFCA 2017. Lecture Notes in Computer Science, vol 10551. Springer, Cham. https://doi.org/10.1007/978-3-319-67675-3_19
DOI: https://doi.org/10.1007/978-3-319-67675-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67674-6
Online ISBN: 978-3-319-67675-3
eBook Packages: Computer Science, Computer Science (R0)