Abstract
When data sets are multilevel (group nesting or repeated measures), the different sources of variation must be identified. Within the framework of unsupervised analysis, multilevel simultaneous component analysis (MSCA) has recently been proposed as the most satisfactory option for analyzing multilevel data. MSCA estimates submodels for the different levels in the data and thereby separates the “within”-subject and “between”-subject variation in the variables. Following the principles of MSCA, and the strategy of decomposing the available data matrix into orthogonal blocks that reflect the between and within data structures, we generalize to the multilevel setting those multivariate models in which a matrix of response variables guides the projections toward meaningful directions; the projections are formed by the responses predicted from the explanatory variables, or from a limited number of their combinations (composites). To this end, the paper proposes multilevel versions of the multivariate regression model and of dimensionality-reduction methods, which predict the responses with fewer linear composites of the explanatory variables. The principal finding of the study is that minimizing the loss functions of multivariate regression, principal-component regression, reduced-rank regression, and canonical-correlation regression is equivalent, under some constraints, to minimizing the sum of two separate loss functions, one for the between structure and one for the within structure, each of which can be minimized separately. The paper closes with a case study on the relationship between mental health severity and intensity of care in the mental health system of the Lombardy region.
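The additive structure of the loss can be checked numerically for the simplest case, unrestricted multivariate regression. The sketch below is illustrative only and is not taken from the paper: it assumes the between block is formed by group means of the column-centred data and the within block by deviations from those means, and it uses simulated data. All names (`between_within`, `groups`, `Bb`, `Bw`) are hypothetical.

```python
import numpy as np

def between_within(M, groups):
    """Split a matrix into between (group-mean) and within (deviation) blocks.

    Assumption: columns are centred on the grand mean first, so that
    centred M = between + within, with the two blocks orthogonal.
    """
    M = M - M.mean(axis=0)
    between = np.zeros_like(M)
    for g in np.unique(groups):
        idx = groups == g
        between[idx] = M[idx].mean(axis=0)  # every row gets its group mean
    return between, M - between

# Simulated two-level data: 5 groups with 8 observations each.
rng = np.random.default_rng(0)
groups = np.repeat(np.arange(5), 8)
X = rng.normal(size=(40, 3)) + 0.5 * groups[:, None]
Y = rng.normal(size=(40, 2)) + 0.3 * groups[:, None]

Xb, Xw = between_within(X, groups)
Yb, Yw = between_within(Y, groups)

# Least-squares coefficients fitted separately at each level.
Bb = np.linalg.lstsq(Xb, Yb, rcond=None)[0]
Bw = np.linalg.lstsq(Xw, Yw, rcond=None)[0]

loss_between = np.linalg.norm(Yb - Xb @ Bb) ** 2
loss_within = np.linalg.norm(Yw - Xw @ Bw) ** 2

# Loss of the joint model on the centred responses Yb + Yw.
loss_total = np.linalg.norm((Yb + Yw) - Xb @ Bb - Xw @ Bw) ** 2

# Largest cross-product between the blocks; zero up to rounding error.
cross = np.abs(Xb.T @ Xw).max()

print(np.isclose(loss_total, loss_between + loss_within), cross < 1e-10)
```

Because the between residual is constant within each group and the within residual sums to zero within each group, the cross term of the squared norm vanishes, so the joint loss splits into the two level-specific losses. The reduced-dimension methods named in the abstract add constraints that this sketch does not cover.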
Cite this article
Lovaglio, P.G., Vittadini, G. Multilevel dimensionality-reduction methods. Stat Methods Appl 22, 183–207 (2013). https://doi.org/10.1007/s10260-012-0215-2
DOI: https://doi.org/10.1007/s10260-012-0215-2