Abstract
This paper adapts symbolic interval Principal Component Analysis (PCA) to histogram data. We propose two methodologies. The first involves three steps: coding the bins of the histograms, performing an ordinary PCA of the means of the variables, and representing the dispersion of the symbolic observations, which we call concepts. To represent the dispersion of these concepts, we propose transforming the histograms into intervals. We then project the hypercubes, or the interval lengths, associated with each concept onto the principal axes of the ordinary PCA of means. The second methodology applies the same three steps together with the angular transformation.
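As a rough illustration of the first and third steps, the following minimal Python sketch codes the bins of one histogram, computes its mean, and turns the histogram into an interval. The abstract does not give the exact rules, so everything here is an assumption made for illustration: bins are coded by their rank 1..m, the interval is taken as the mean plus or minus k standard deviations, and the angular transformation is taken in its classical arcsine square-root form. All function names are hypothetical and may differ from what the paper actually does.

```python
import math

def code_bins(n_bins):
    """Assumed coding: the k-th ordered bin is represented by its rank k."""
    return list(range(1, n_bins + 1))

def histogram_mean(freqs, codes):
    """Mean of the bin codes, weighted by the relative frequencies."""
    return sum(f * c for f, c in zip(freqs, codes))

def histogram_std(freqs, codes):
    """Standard deviation of the bin codes under the same weights."""
    m = histogram_mean(freqs, codes)
    return math.sqrt(sum(f * (c - m) ** 2 for f, c in zip(freqs, codes)))

def histogram_to_interval(freqs, codes, k=1.0):
    """Assumed rule (not necessarily the authors'): mean +/- k std devs."""
    m = histogram_mean(freqs, codes)
    s = histogram_std(freqs, codes)
    return (m - k * s, m + k * s)

def angular_transform(freqs):
    """Classical arcsine square-root transform of relative frequencies."""
    return [math.asin(math.sqrt(f)) for f in freqs]

# One concept described by a three-bin histogram (relative frequencies).
freqs = [0.2, 0.5, 0.3]
codes = code_bins(len(freqs))
mean = histogram_mean(freqs, codes)
low, high = histogram_to_interval(freqs, codes)
angles = angular_transform(freqs)
```

In a full analysis, the means of all concepts and variables would form the table submitted to an ordinary PCA, and the interval endpoints would then be projected onto the resulting principal axes; that part is omitted here to keep the sketch free of external dependencies.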
Cite this article
Makosso-Kallyth, S., Diday, E. Adaptation of interval PCA to symbolic histogram variables. Adv Data Anal Classif 6, 147–159 (2012). https://doi.org/10.1007/s11634-012-0108-0
DOI: https://doi.org/10.1007/s11634-012-0108-0