Semi-supervised Clustering Ensemble Based on Collaborative Training

Zhang, Jinyuan; Yang, Yan; Wang, Hongjun; Mahmood, Amjad; Huang, Feifei

doi:10.1007/978-3-642-31900-6_55

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7414)

Included in the following conference series:

International Conference on Rough Sets and Knowledge Technology

Abstract

Recent research on data clustering increasingly focuses on combining multiple data partitions as a way to improve the robustness of clustering solutions, and most of this work addresses the combination of crisp clusterings. Semi-supervised clustering uses a small amount of labeled data to aid and bias the clustering of unlabeled data. In this paper, we propose a semi-supervised clustering ensemble model based on collaborative training (SCET) and an unsupervised clustering ensemble model based on collaborative training (UCET). In SCET, semi-supervised learning is introduced in the ensemble step. In UCET, the knowledge used in SCET is replaced by information extracted from the base clusterings. Tri-training is then used as the consensus function of the clustering ensemble. Experiments on datasets from the UCI Machine Learning Repository indicate that the approach improves clustering accuracy.
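
The abstract does not spell out the consensus procedure, so the following is only a minimal sketch of the generic tri-training idea it refers to, not the authors' SCET or UCET implementation. It assumes a simple nearest-centroid classifier, deterministic sub-sampling in place of bootstrap sampling, and a fixed round limit; the function names `fit_centroids`, `predict`, and `tri_training_consensus` are hypothetical and chosen for illustration.

```python
from collections import Counter


def fit_centroids(points, labels):
    """Nearest-centroid 'classifier': map each label to the mean of its points."""
    sums, counts = {}, Counter(labels)
    for p, lab in zip(points, labels):
        acc = sums.setdefault(lab, [0.0] * len(p))
        for i, v in enumerate(p):
            acc[i] += v
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def predict(model, p):
    """Return the label whose centroid is closest to p (squared Euclidean)."""
    return min(model, key=lambda lab: sum((a - b) ** 2 for a, b in zip(model[lab], p)))


def train_all(train_sets):
    """Fit one nearest-centroid model per training set."""
    return [fit_centroids([p for p, _ in t], [lab for _, lab in t]) for t in train_sets]


def tri_training_consensus(labeled, unlabeled, rounds=5):
    """labeled: list of (point, label); unlabeled: list of points.

    Returns one consensus label per unlabeled point.
    """
    # Assumption: three deterministic sub-samples stand in for the bootstrap
    # samples of standard tri-training. Each must contain every class.
    base = [[ex for n, ex in enumerate(labeled) if n % 3 != i] for i in range(3)]
    train = [list(b) for b in base]
    for _ in range(rounds):
        models = train_all(train)
        preds = [[predict(m, p) for p in unlabeled] for m in models]
        new_train = []
        for i in range(3):
            j, k = [c for c in range(3) if c != i]
            # Pseudo-label the points on which the two OTHER classifiers agree.
            extra = [(p, preds[j][n]) for n, p in enumerate(unlabeled)
                     if preds[j][n] == preds[k][n]]
            new_train.append(base[i] + extra)
        if new_train == train:  # pseudo-labels stopped changing
            break
        train = new_train
    models = train_all(train)
    # Final consensus: majority vote (ties go to the first label encountered).
    return [Counter(predict(m, p) for m in models).most_common(1)[0][0]
            for p in unlabeled]


if __name__ == "__main__":
    seeds = [((0, 0), "a"), ((0, 1), "a"), ((1, 0), "a"),
             ((5, 5), "b"), ((5, 6), "b"), ((6, 5), "b")]
    print(tri_training_consensus(seeds, [(0.5, 0.5), (5.5, 5.5), (1, 1), (4, 4)]))
```

In a semi-supervised setting the seed labels would come from the small labeled set; in an unsupervised variant they would presumably be replaced by labels derived from the base clusterings, a step this sketch does not attempt to reproduce.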

This work is partially supported by the National Science Foundation of China (Nos. 61170111, 61003142 and 61152001) and the Fundamental Research Funds for the Central Universities (No. SWJTU11ZT08).

Author information

Authors and Affiliations

School of Information Science & Technology, Southwest Jiaotong University, Chengdu, 610031, P.R. China
Jinyuan Zhang, Yan Yang, Hongjun Wang, Amjad Mahmood & Feifei Huang
Key Lab. of Cloud Computing and Intelligent Technology, Chengdu, Sichuan Province, 610031, P.R. China
Jinyuan Zhang, Yan Yang, Hongjun Wang, Amjad Mahmood & Feifei Huang

Editor information

Editors and Affiliations

School of Information Science and Technology, Southwest Jiaotong University, 610031, Chengdu, P.R. China
Tianrui Li
Institute of Mathematics, The University of Warsaw, ul. Banacha 2, 02-097, Warsaw, Poland
Hung Son Nguyen
Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 400065, Chongqing, China
Guoyin Wang
Department of Electrical Engineering and Computer Science, University of Kansas, 66045 - 7621, Lawrence, KS, USA
Jerzy Grzymala-Busse
Department of Computing and Software, McMaster University, L8S 4K1, Hamilton, Ontario, Canada
Ryszard Janicki
Faculty of Computers and Information, Cairo University, Cairo, Egypt
Aboul Ella Hassanien
Software School, Dalian University of Technology, Dalian, China
Hong Yu

Cite this paper

Zhang, J., Yang, Y., Wang, H., Mahmood, A., Huang, F. (2012). Semi-supervised Clustering Ensemble Based on Collaborative Training. In: Li, T., et al. Rough Sets and Knowledge Technology. RSKT 2012. Lecture Notes in Computer Science, vol 7414. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31900-6_55

DOI: https://doi.org/10.1007/978-3-642-31900-6_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31899-3
Online ISBN: 978-3-642-31900-6
eBook Packages: Computer Science, Computer Science (R0)
