Average Cluster Consistency for Cluster Ensemble Selection

Duarte, F. Jorge F.; Duarte, João M. M.; Fred, Ana L. N.; Rodrigues, M. Fátima C.

doi:10.1007/978-3-642-19032-2_10


Part of the book series: Communications in Computer and Information Science (CCIS, volume 128)

Included in the following conference series:

International Joint Conference on Knowledge Discovery, Knowledge Engineering, and Knowledge Management

Abstract

Various approaches for producing cluster ensembles, and several consensus functions for combining data partitions, have been proposed in order to obtain a more robust partition of the data. However, the existence of many approaches raises another problem: knowing which combination of ensemble construction method and consensus function best fits a given data set. In this paper, we propose a new measure for selecting the best consensus data partition among a variety of consensus partitions. The measure is based on the concept of average cluster consistency between each data partition that belongs to the cluster ensemble and a given consensus partition. Experimental results on 9 data sets, comparing this measure with other measures for cluster ensemble selection, showed that the partitions selected by our measure were generally of superior quality to the consensus partitions selected by the other measures.
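
The selection procedure described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for average cluster consistency, so the `consistency` function below is a hypothetical stand-in, not the authors' definition: it measures the fraction of points that fall in the best-matching cluster when the partition with more clusters is compared against the other. The function names and the toy data are likewise assumptions made for illustration.

```python
from collections import Counter

def consistency(a, b):
    """Hypothetical stand-in measure (NOT the paper's formula).

    Treats the partition with more clusters as the finer one and returns
    the fraction of points lying in the largest overlap between each fine
    cluster and any cluster of the other partition.
    """
    if len(a) != len(b):
        raise ValueError("partitions must label the same points")
    fine, coarse = (a, b) if len(set(a)) >= len(set(b)) else (b, a)
    # Count points shared by each (fine cluster, coarse cluster) pair.
    overlap = Counter(zip(fine, coarse))
    best = {}
    for (f, _c), count in overlap.items():
        best[f] = max(best.get(f, 0), count)
    return sum(best.values()) / len(a)

def average_consistency(ensemble, consensus):
    """Average the pairwise measure over every partition in the ensemble."""
    return sum(consistency(p, consensus) for p in ensemble) / len(ensemble)

def select_consensus(ensemble, candidates):
    """Pick the candidate consensus partition with the highest average score."""
    return max(candidates, key=lambda c: average_consistency(ensemble, c))

# Toy data (assumed): three ensemble partitions of four points,
# and two candidate consensus partitions.
ensemble = [[0, 0, 1, 1], [0, 0, 1, 1], [1, 1, 0, 0]]
candidates = [[0, 0, 1, 1], [0, 1, 0, 1]]
selected = select_consensus(ensemble, candidates)
```

Note that this unweighted stand-in scores a single-cluster candidate as perfectly consistent with any partition, so it serves only to show the shape of the selection loop; a usable measure would need to penalize such trivial partitions.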

Author information

Authors and Affiliations

GECAD - Knowledge Engineering and Decision Support Group, Instituto Superior de Engenharia do Porto, R. Dr. António Bernardino de Almeida, 431, P-4200-072, Porto, Portugal
F. Jorge F. Duarte, João M. M. Duarte & M. Fátima C. Rodrigues
Instituto de Telecomunicações, Instituto Superior Técnico, Av. Rovisco Pais, 1, P-1049-001, Lisboa, Portugal
João M. M. Duarte & Ana L. N. Fred

Editor information

Editors and Affiliations

IST - Technical University of Lisbon, Av. Rovisco Pais, 1, 1049-001, Lisbon, Portugal
Ana Fred
Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands
Jan L. G. Dietz
Informatics Research Centre, Henley Business School, University of Reading, RG6 6UD, Reading, UK
Kecheng Liu
Department of Systems and Informatics, Polytechnic Institute of Setúbal – INSTICC, Rua do Vale de Chaves - Estefanilha, 2910-761, Setúbal, Portugal
Joaquim Filipe

About this paper

Cite this paper

Duarte, F.J.F., Duarte, J.M.M., Fred, A.L.N., Rodrigues, M.F.C. (2011). Average Cluster Consistency for Cluster Ensemble Selection. In: Fred, A., Dietz, J.L.G., Liu, K., Filipe, J. (eds) Knowledge Discovery, Knowledge Engineering and Knowledge Management. IC3K 2009. Communications in Computer and Information Science, vol 128. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19032-2_10

DOI: https://doi.org/10.1007/978-3-642-19032-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19031-5
Online ISBN: 978-3-642-19032-2
eBook Packages: Computer Science, Computer Science (R0)
