Symbolic Clustering of Large Datasets

Lechevallier, Yves; Verde, Rosanna; de Carvalho, Francisco de A. T.

doi:10.1007/3-540-34416-0_21

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization

Abstract

We present an approach to clustering large datasets that integrates Kohonen Self-Organizing Maps (SOM) with a dynamic clustering algorithm for symbolic data (SCLUST). A preliminary data reduction is performed using the SOM algorithm; as a result, the individual measurements are replaced by micro-clusters. These micro-clusters are then grouped into a few clusters, which are modeled by symbolic objects. By computing the extension of these symbolic objects, the symbolic clustering algorithm makes it possible to discover the natural classes. An application to a real data set shows the usefulness of this methodology.
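
The two-stage idea in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' SCLUST implementation: the small NumPy SOM, the weighted k-means used as a stand-in for the symbolic clustering step, the choice of per-variable intervals as the symbolic description, and all function names (`train_som`, `assign`, `group_prototypes`, `interval_description`, `extension`) are hypothetical choices made for this example. The extension of a description is taken here to be the set of individuals whose values fall inside every interval.

```python
import numpy as np

def train_som(data, grid=(3, 3), epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns prototypes, shape (n_units, n_features)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    # Grid coordinates of each unit, used by the neighbourhood function.
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes on randomly chosen data points (assumes len(data) >= n_units).
    protos = data[rng.choice(len(data), rows * cols, replace=False)].astype(float)
    n_steps, step = epochs * len(data), 0
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
            sigma = sigma0 * (1.0 - frac) + 1e-3    # linearly shrinking neighbourhood
            bmu = np.argmin(((protos - x) ** 2).sum(axis=1))  # best-matching unit
            d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2.0 * sigma ** 2))
            protos += lr * h[:, None] * (x - protos)
            step += 1
    return protos

def assign(points, centers):
    """Index of the nearest center (squared Euclidean distance) for each point."""
    d = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def group_prototypes(protos, weights, k, n_iter=50, seed=0):
    """Weighted k-means on SOM prototypes (ASSUMED stand-in for the SCLUST step)."""
    rng = np.random.default_rng(seed)
    centers = protos[rng.choice(len(protos), k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(protos, centers)
        for j in range(k):
            mask = (labels == j) & (weights > 0)   # ignore units with no individuals
            if mask.any():
                centers[j] = np.average(protos[mask], axis=0, weights=weights[mask])
    return assign(protos, centers)

def interval_description(data, member_mask):
    """Describe a cluster by per-variable [min, max] intervals, shape (2, n_features)."""
    members = data[member_mask]
    return np.stack([members.min(axis=0), members.max(axis=0)])

def extension(data, desc):
    """Boolean mask of individuals lying inside every interval of the description."""
    return np.all((data >= desc[0]) & (data <= desc[1]), axis=1)

# Synthetic data, used only for illustration.
rng = np.random.default_rng(42)
data = np.vstack([rng.normal(0, 0.3, (200, 2)), rng.normal(5, 0.3, (200, 2))])

protos = train_som(data)                          # stage 1: data reduction
micro = assign(data, protos)                      # micro-cluster of each individual
weights = np.bincount(micro, minlength=len(protos)).astype(float)
unit_cluster = group_prototypes(protos, weights, k=2)   # stage 2: group micro-clusters
labels = unit_cluster[micro]                      # cluster of each individual

for j in range(2):
    if not (labels == j).any():
        continue                                  # skip clusters with no individuals
    desc = interval_description(data, labels == j)
    ext = extension(data, desc)
    print(f"cluster {j}: intervals={desc.round(2).tolist()}, extension size={int(ext.sum())}")
```

Comparing the extension of each description with the cluster's own members gives one way to see whether the interval description also captures individuals assigned to other clusters.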

Author information

Authors and Affiliations

Domaine de Voluceau, Rocquencourt, B.P. 105, 78153, Le Chesnay Cedex, France
Yves Lechevallier
Dip. di Strategie Aziendali e Metod. Quantitative, Seconda Università di Napoli, Piazza Umberto I, 81043, Capua (CE), Italy
Rosanna Verde
Centro de Informatica - CIn/UFPE, Cidade Universitaria, Av. Prof. Luiz Freire, s/n, CEP 50740-540, Recife-PE, Brasil
Francisco de A. T. de Carvalho

Editor information

Editors and Affiliations

Department of Mathematics, FMF, University of Ljubljana, Jadranska 19, 1000, Ljubljana, Slovenia
Vladimir Batagelj
Institute of Statistics, RWTH Aachen University, 52056, Aachen, Germany
Hans-Hermann Bock
Faculty of Social Sciences, University of Ljubljana, Kardeljeva pl. 5, 1000, Ljubljana, Slovenia
Anuška Ferligoj & Aleš Žiberna

About this paper

Cite this paper

Lechevallier, Y., Verde, R., de Carvalho, F.d.A.T. (2006). Symbolic Clustering of Large Datasets. In: Batagelj, V., Bock, HH., Ferligoj, A., Žiberna, A. (eds) Data Science and Classification. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-34416-0_21

DOI: https://doi.org/10.1007/3-540-34416-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34415-5
Online ISBN: 978-3-540-34416-2
eBook Packages: Mathematics and Statistics (R0)
