Abstract
In this work we describe a clustering and feature selection technique applied to the analysis of international dietary profiles. At the core of the technique, together with principal component analysis (PCA), is an asymmetric entropy-based measure that assesses the similarity between two clusterings while also taking subclustering relationships into account. A feature analysis of the dataset with respect to its hierarchical clustering is then performed. In this way, the most significant features of the dataset are identified, giving a deeper understanding of how the data are distributed.
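The abstract does not define the entropy-based measure, so the following is only an illustrative sketch and not the authors' method. It assumes conditional entropy as a stand-in: H(A | B) is asymmetric and drops to zero when every cluster of B lies inside a single cluster of A, which is one way a measure can reflect subclustering relationships. The function name `conditional_entropy` and the label lists `coarse` and `fine` are hypothetical.

```python
import math
from collections import Counter


def conditional_entropy(a, b):
    """Return H(A | B) in bits for two cluster labelings of the same items.

    Illustrative stand-in only: the paper's actual measure is not
    specified in the abstract.
    """
    if len(a) != len(b):
        raise ValueError("label sequences must have the same length")
    n = len(a)
    joint = Counter(zip(a, b))   # co-occurrence counts of (label in A, label in B)
    b_counts = Counter(b)        # cluster sizes in B
    h = 0.0
    for (_, label_b), n_ab in joint.items():
        # p(a, b) * log2 p(a | b), accumulated with a minus sign
        h -= (n_ab / n) * math.log2(n_ab / b_counts[label_b])
    return h


# Hypothetical labelings: `fine` splits each cluster of `coarse` in two.
coarse = [0, 0, 0, 0, 1, 1, 1, 1]
fine = [0, 0, 1, 1, 2, 2, 3, 3]

print(conditional_entropy(coarse, fine))  # knowing the fine cluster fixes the coarse one
print(conditional_entropy(fine, coarse))  # the reverse direction leaves uncertainty
```

Under this assumption the two directions give different values, which is what makes the quantity asymmetric: a zero in one direction indicates that one clustering refines the other.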
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bishehsari, F. et al. (2007). PCA Based Feature Selection Applied to the Analysis of the International Variation in Diet. In: Masulli, F., Mitra, S., Pasi, G. (eds) Applications of Fuzzy Sets Theory. WILF 2007. Lecture Notes in Computer Science, vol 4578. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73400-0_70
DOI: https://doi.org/10.1007/978-3-540-73400-0_70
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73399-7
Online ISBN: 978-3-540-73400-0
eBook Packages: Computer Science, Computer Science (R0)