Abstract
Generalized canonical correlation analysis (GCANO) is a versatile technique that allows the joint analysis of several sets of data matrices through data reduction. The method embraces a number of representative techniques of multivariate data analysis as special cases. The GCANO solution can be obtained noniteratively through an eigenequation, and distributional assumptions are not required. However, the high computational and memory requirements of ordinary eigendecomposition make its application impractical on massive or sequential data sets. The aim of the present contribution is twofold: (a) to extend the family of GCANO techniques to a split-apply-combine framework, which leads to an exact implementation; and (b) to allow for incremental updates of existing solutions, which lead to approximate yet highly accurate solutions. For this purpose, an incremental SVD approach with desirable properties is revised and embedded in the context of GCANO, which extends the applicability of GCANO to modern big data problems and data streams.
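The split-apply-combine idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes only that the eigenequation is built from cross-product matrices of the column-centered data, and it shows why processing the rows in chunks can be exact: row counts, column sums and raw cross-products simply add up across chunks. The function names `chunk_stats` and `combine` and the toy data are hypothetical.

```python
from typing import Iterable, List, Tuple

Matrix = List[List[float]]
Stats = Tuple[int, List[float], Matrix]


def chunk_stats(chunk: Matrix) -> Stats:
    """Apply step: row count, column sums and raw cross-product X'X of one chunk."""
    p = len(chunk[0])
    sums = [0.0] * p
    cross = [[0.0] * p for _ in range(p)]
    for row in chunk:
        for i in range(p):
            sums[i] += row[i]
            for j in range(p):
                cross[i][j] += row[i] * row[j]
    return len(chunk), sums, cross


def combine(stats: Iterable[Stats]) -> Matrix:
    """Combine step: centered cross-product of the full data from chunk summaries."""
    stats = list(stats)
    p = len(stats[0][1])
    n = sum(s[0] for s in stats)
    sums = [sum(s[1][i] for s in stats) for i in range(p)]
    cross = [[sum(s[2][i][j] for s in stats) for j in range(p)] for i in range(p)]
    # Centering correction: Z'Z = X'X - (1/n) * (column sums)(column sums)'
    return [[cross[i][j] - sums[i] * sums[j] / n for j in range(p)] for i in range(p)]


# Toy data (hypothetical): 5 observations on 3 variables, split into two row chunks.
data = [[1.0, 2.0, 0.0], [2.0, 1.0, 1.0], [4.0, 3.0, 1.0], [3.0, 6.0, 0.0], [5.0, 3.0, 3.0]]
chunks = [data[:2], data[2:]]

chunked = combine(chunk_stats(c) for c in chunks)  # split-apply-combine
full = combine([chunk_stats(data)])                # all rows at once
```

Up to floating-point rounding, `chunked` and `full` agree, so any eigendecomposition computed from the combined matrix would match the one computed from the full data. The matrix being decomposed has one row and column per variable, so its size does not grow with the number of observations.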
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Markos, A., D’Enza, A.I. (2016). Incremental Generalized Canonical Correlation Analysis. In: Wilhelm, A., Kestler, H. (eds) Analysis of Large and Complex Data. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Cham. https://doi.org/10.1007/978-3-319-25226-1_16
DOI: https://doi.org/10.1007/978-3-319-25226-1_16
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25224-7
Online ISBN: 978-3-319-25226-1
eBook Packages: Mathematics and Statistics (R0)