Abstract
Compositional data analysis is useful for revealing patterns of relative variability among variables that describe the parts of a phenomenon. Compositions are often represented as orthonormal balances associated with a sequential binary partition (SBP). Principal balance analysis (PBA) is a tool for finding a meaningful SBP by successively maximizing explained variability. Exact estimation of PBA is computationally prohibitive for large datasets; algorithms that provide an acceptable approximation are therefore used instead. For third-order compositional data, such an exploratory search must also account for third-mode variability. To this end, this work introduces a three-way adaptation of PBA in which estimation is carried out by means of Tucker3. A study of the composition of academic recruitment fields by Italian macro-region and gender/role illustrates the merits of the procedure.
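As a reading aid, the link between an SBP and its orthonormal balances can be sketched in a few lines of Python. This is a minimal illustration of the standard balance formula, not the estimation procedure proposed in the article. The 4-part sign matrix, the example composition and the function names (`sbp_to_contrast`, `balances`) are hypothetical. For a partition step with r parts coded +1 and s parts coded -1, the balance is sqrt(rs/(r+s)) times the log-ratio of the geometric means of the two groups.

```python
import math


def sbp_to_contrast(sbp):
    """Convert an SBP sign matrix (rows = partition steps, entries +1, -1 or 0)
    into the matrix of orthonormal log-contrast coefficients."""
    contrast = []
    for row in sbp:
        r = sum(1 for sign in row if sign > 0)  # parts in the +1 group
        s = sum(1 for sign in row if sign < 0)  # parts in the -1 group
        norm = math.sqrt(r * s / (r + s))       # normalizing constant of the balance
        contrast.append([
            norm / r if sign > 0 else (-norm / s if sign < 0 else 0.0)
            for sign in row
        ])
    return contrast


def balances(composition, contrast):
    """Balance coordinates: each contrast row applied to the log of the parts."""
    logs = [math.log(part) for part in composition]
    return [sum(c * l for c, l in zip(row, logs)) for row in contrast]


# Hypothetical SBP for a 4-part composition (illustrative only):
# step 1 splits {1, 2} from {3, 4}; step 2 splits 1 from 2; step 3 splits 3 from 4.
SBP = [
    [1, 1, -1, -1],
    [1, -1, 0, 0],
    [0, 0, 1, -1],
]

PSI = sbp_to_contrast(SBP)
x = [0.1, 0.2, 0.3, 0.4]  # made-up composition, closed to 1
b = balances(x, PSI)
```

Because every contrast row sums to zero, the balances depend only on ratios between parts, so rescaling `x` (for instance to percentages) leaves `b` unchanged. PBA then amounts to choosing, among all admissible SBPs, one whose balances successively capture as much variability as possible; that search is not attempted in this sketch.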




Cite this article
Simonacci, V., Gallo, M. Three-way principal balance analysis: algorithm and interpretation. Ann Oper Res 342, 1429–1443 (2024). https://doi.org/10.1007/s10479-022-04782-5
DOI: https://doi.org/10.1007/s10479-022-04782-5