MCC: a Multiple Consensus Clustering Framework

Li, Tao; Zhang, Yi; Wang, Dingding; Xu, Jian

doi:10.1007/s00357-019-09318-4

Published: 09 August 2019

Volume 36, pages 414–434, (2019)

Journal of Classification

Abstract

Consensus clustering has emerged as an important extension of the classical clustering problem. Given a set of input clusterings of a dataset, consensus clustering aims to find a single final clustering that is, in some sense, a better fit than the existing clusterings. Generating a single consensus clustering has a significant drawback, however, because the input clusterings can differ substantially from one another. In this paper, we develop a new framework, called Multiple Consensus Clustering (MCC), to explore multiple clustering views of a given dataset from a set of input clusterings. Instead of generating a single consensus, we propose two sets of approaches to obtain multiple consensuses. One employs the meta clustering method, and the other uses a hierarchical tree structure and further applies a dynamic programming algorithm to generate a flat partition from the hierarchical tree using the modularity measure. Multiple consensuses are finally obtained by applying consensus clustering algorithms to each cluster of the partition. Extensive experimental results on 11 real-world datasets and a case study on a Protein-Protein Interaction (PPI) dataset demonstrate the effectiveness of the MCC framework.
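
The overall idea of grouping the input clusterings first and then building one consensus per group can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Rand index as the similarity between clusterings, the fixed threshold of 0.8, the connected-components grouping, and the majority-vote co-association consensus are all simplifying assumptions chosen for illustration, and the function names (`rand_index`, `consensus`, `multiple_consensus`) are hypothetical.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of object pairs that two label lists treat the same way."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def connected_components(n, linked):
    """Group indices 0..n-1 into components; linked(i, j) says if i and j are joined."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if linked(i, j):
            parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def consensus(clusterings):
    """Assumed stand-in consensus: join objects that most clusterings co-cluster."""
    n = len(clusterings[0])

    def mostly_together(i, j):
        votes = sum(c[i] == c[j] for c in clusterings)
        return votes > len(clusterings) / 2

    labels = [0] * n
    for k, members in enumerate(connected_components(n, mostly_together)):
        for i in members:
            labels[i] = k
    return labels


def multiple_consensus(clusterings, threshold=0.8):
    """Group similar input clusterings, then build one consensus per group."""
    def similar(p, q):
        return rand_index(clusterings[p], clusterings[q]) >= threshold

    groups = connected_components(len(clusterings), similar)
    return [consensus([clusterings[p] for p in g]) for g in groups]


# Six input clusterings of six objects that reflect two different views.
views = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as above, relabeled
    [0, 0, 0, 1, 1, 2],
    [0, 1, 0, 1, 0, 1],
    [1, 0, 1, 0, 1, 0],  # same partition as above, relabeled
    [0, 1, 0, 1, 0, 2],
]
for labels in multiple_consensus(views):
    print(labels)
```

On this toy input a single consensus would have to blend two incompatible views, whereas the sketch returns one consensus for each group of mutually similar clusterings. The hierarchical-tree variant with dynamic programming and modularity described in the abstract is not covered here.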

Notes

The software can be downloaded from http://glaros.dtc.umn.edu/gkhome/views/metis/.

Funding

This work is partially supported by the National Science Foundation under grants DBI-0850203, HRD-0833093, CNS-1126619, IIS-1213026, and CNS-1461926, the U.S. Department of Homeland Security VACCINE Center under Award Number 2009-ST-061-CI0001, and an FIU Dissertation Year Fellowship.

Author information

Authors and Affiliations

School of Computer Science, Florida International University, 11200 SW 8th Street, Miami, FL, 33199, USA
Tao Li
Seattle, USA
Yi Zhang
CEECS Department, Florida Atlantic University, 777 Glades Rd, Boca Raton, FL, 33431, USA
Dingding Wang
Nanjing, China
Jian Xu

Corresponding author

Correspondence to Tao Li.

About this article

Cite this article

Li, T., Zhang, Y., Wang, D. et al. MCC: a Multiple Consensus Clustering Framework. J Classif 36, 414–434 (2019). https://doi.org/10.1007/s00357-019-09318-4

Published: 09 August 2019
Issue Date: October 2019
DOI: https://doi.org/10.1007/s00357-019-09318-4
