Abstract
Clustering is one of the major tasks in data mining. However, selecting an algorithm to cluster a dataset is difficult, especially when there is no prior knowledge of the structure of the data. Consensus clustering methods can combine multiple base clusterings into a new solution that provides a better partitioning. In this work, we present a new consensus clustering method based on detecting clustering patterns by mining frequent closed itemsets. Instead of generating a single consensus, this method generates multiple consensuses by varying the number of base clusterings, and links these solutions in a hierarchical representation that eases the selection of the best clustering. This hierarchical view also provides an analysis tool, for example to discover strong clusters or outlier instances.
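As a rough illustration of the idea of clustering patterns, the following Python sketch builds a binary cluster membership matrix from a few base clusterings and enumerates its frequent closed itemsets by brute force. It is only a sketch under stated assumptions: the function names `membership_matrix` and `closed_patterns`, the instance-by-cluster orientation of the matrix, and the toy labels are all hypothetical, and the exhaustive enumeration stands in for a dedicated closed itemset miner rather than reproducing the method of the paper.

```python
from itertools import combinations


def membership_matrix(base_clusterings):
    """Build a binary instance-by-cluster membership matrix.

    Assumption: each base clustering is a list of labels, one per instance.
    Each row is returned as the frozenset of column indices set to 1.
    """
    columns = []  # one column per (clustering index, cluster label)
    for k, labels in enumerate(base_clusterings):
        for label in sorted(set(labels)):
            columns.append((k, label))
    n_instances = len(base_clusterings[0])
    rows = [
        frozenset(
            j for j, (k, label) in enumerate(columns)
            if base_clusterings[k][i] == label
        )
        for i in range(n_instances)
    ]
    return columns, rows


def closed_patterns(rows, min_support=1):
    """Enumerate frequent closed itemsets by brute force (toy sizes only).

    Returns a dict mapping each closed set of clusters to the frozenset of
    instances that belong to all of them.
    """
    items = sorted(set().union(*rows))
    closed = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            itemset = frozenset(candidate)
            support = frozenset(
                i for i, row in enumerate(rows) if itemset <= row
            )
            if len(support) < max(1, min_support):
                continue
            # An itemset is closed when it equals the intersection of
            # all rows (instances) that contain it.
            closure = frozenset.intersection(*(rows[i] for i in support))
            if closure == itemset:
                closed[itemset] = support
    return closed


# Hypothetical toy data: three base clusterings of six instances.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
]
columns, rows = membership_matrix(base)
patterns = closed_patterns(rows, min_support=2)

# Patterns shared by all base clusterings (maximum agreement).
full_agreement = {
    itemset: instances
    for itemset, instances in patterns.items()
    if len(itemset) == len(base)
}
for itemset, instances in sorted(full_agreement.items(), key=lambda p: sorted(p[0])):
    print([columns[j] for j in sorted(itemset)], sorted(instances))
```

In this toy setting, a pattern that contains one cluster from every base clustering groups instances on which all base clusterings agree. Patterns built from fewer clusters correspond to weaker agreement, which is the kind of variation the abstract describes when it mentions multiple consensuses.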
Notes
1. See Sect. 3.1 for a definition of cluster membership matrix.
2. Generating only clustering patterns of maximum agreement between base clusterings reduces processing time.
3. This is the objective of clustering algorithms, yet they differ in how they define the similarity between instances.
4. Another possibility is to use the arules R package [6].
References
Asur, S., Ucar, D., Parthasarathy, S.: An ensemble framework for clustering protein-protein interaction networks. Bioinformatics 23(13), i29–i40 (2007)
Caruana, R., Elhawary, M., Nguyen, N., Smith, C.: Meta clustering. In: Proceedings of the IEEE ICDM Conference, pp. 107–118 (2006)
Csardi, G., Nepusz, T.: The igraph software package for complex network research. InterJournal Complex Systems, 1695 (2006). http://igraph.org
Dalton, L., Ballarin, V., Brun, M.: Clustering algorithms: on learning, validation, performance, and applications to genomics. Curr. Genomics 10(6), 430 (2009)
Ghaemi, R., Sulaiman, M.N., Ibrahim, H., Mustapha, N.: A survey: clustering ensembles techniques. WASET 50, 636–645 (2009)
Hahsler, M., Gruen, B., Hornik, K.: arules - a computational environment for mining association rules and frequent item sets. J. Stat. Softw. 14(15), 1–25 (2005)
Halkidi, M., Batistakis, Y., Vazirgiannis, M.: On clustering validation techniques. J. Intell. Inf. Syst. 17(2), 107–145 (2001)
Hornik, K.: A CLUE for CLUster Ensembles. J. Stat. Softw. 14(12), 1–25 (2005)
Hornik, K.: CLUE: Cluster ensembles (2015). R package version 0.3-50. http://CRAN.R-project.org/package=clue
Jaccard, P.: The distribution of the flora in the alpine zone.1. New Phytol. 11(2), 37–50 (1912). doi:10.1111/j.1469-8137.1912.tb05611.x
Li, T., Ding, C.: Weighted consensus clustering. In: Proceedings of the SIAM Conference on Data Mining, pp. 798–809 (2008)
Lichman, M.: UCI machine learning repository (2013). http://archive.ics.uci.edu/ml
Mondal, K.C., Pasquier, N., Mukhopadhyay, A., Maulik, U., Bandhopadyay, S.: A new approach for association rule mining and bi-clustering using formal concept analysis. In: Perner, P. (ed.) MLDM 2012. LNCS, vol. 7376, pp. 86–101. Springer, Heidelberg (2012)
Newman, D., Hettich, S., Blake, C., Merz, C.: UCI repository of machine learning databases (1998). http://www.ics.uci.edu/~mlearn/MLRepository.html
Pasquier, N., Bastide, Y., Taouil, R., Lakhal, L.: Efficient mining of association rules using closed itemset lattices. Inf. Syst. 24(1), 25–46 (1999)
R Core Team: R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria (2015). https://www.R-project.org/
Sarumathi, S., Shanthi, N., Sharmila, M.: A comparative analysis of different categorical data clustering ensemble methods in data mining. IJCA 81(4), 46–55 (2013)
Strehl, A., Ghosh, J.: Cluster ensembles - a knowledge reuse framework for combining multiple partitions. JMLR 3, 583–617 (2003)
Ultsch, A.: Clustering with SOM: U*C. In: Proceedings of the WSOM Workshop, pp. 75–82 (2005)
Vega-Pons, S., Ruiz-Shulcloper, J.: A survey of clustering ensemble algorithms. IJPRAI 25(03), 337–372 (2011)
Wu, O., Hu, W., Maybank, S.J., Zhu, M., Li, B.: Efficient clustering aggregation based on data fragments. IEEE Trans. Syst. Man Cybern B Cybern. 42(3), 913–926 (2012)
Xu, D., Tian, Y.: A comprehensive survey of clustering algorithms. Ann. Data Sci. 2(2), 165–193 (2015)
Yang, G.: The complexity of mining maximal frequent itemsets and maximal frequent patterns. In: ACM SIGKDD, pp. 344–353 (2004)
Zhang, Y., Li, T.: Consensus clustering + meta clustering = multiple consensus clustering. In: Proceedings of the FLAIRS Conference (2011)
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Al-Najdi, A., Pasquier, N., Precioso, F. (2016). Frequent Closed Patterns Based Multiple Consensus Clustering. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L., Zurada, J. (eds) Artificial Intelligence and Soft Computing. ICAISC 2016. Lecture Notes in Computer Science(), vol 9693. Springer, Cham. https://doi.org/10.1007/978-3-319-39384-1_2
DOI: https://doi.org/10.1007/978-3-319-39384-1_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-39383-4
Online ISBN: 978-3-319-39384-1
eBook Packages: Computer Science (R0)