A Self-Supervised Framework for Clustering Ensemble

Du, Liang; Shen, Yi-Dong; Shen, Zhiyong; Wang, Jianying; Xu, Zhiwu

doi:10.1007/978-3-642-38562-9_26

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7923)

Included in the following conference series:

International Conference on Web-Age Information Management

Abstract

Clustering ensemble refers to combining a number of base clusterings of a particular data set into a single consensus clustering solution. In this paper, we propose a novel self-supervised learning framework for clustering ensemble. Specifically, we treat the base clusterings as pseudo class labels and learn a classifier for each of them. By placing priors on the parameters of these classifiers, we capture the relationships between different base clusterings and at the same time obtain a single consolidated clustering result. The proposed framework can incorporate the original data features to improve the performance of clustering ensemble. Another advantage, which distinguishes the proposed framework from traditional clustering ensemble approaches, is its generalization capability: it can assign incoming data instances to the consensus clusters directly from the original data features. We conduct extensive experiments on multiple real-world data sets to show the effectiveness of our method.
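To make the idea of "base clusterings as pseudo class labels" concrete, the following is a minimal sketch and not the method of the paper. It assumes ridge-regularized least-squares classifiers, a brute-force label alignment against the first base clustering, and a plain average of the classifier weights as a crude stand-in for the prior-based coupling described in the abstract. All function names, the regularization weight `lam`, and the toy data are hypothetical.

```python
import itertools
import numpy as np

def one_hot(labels, k):
    """Turn an integer label vector into an n-by-k indicator matrix."""
    Y = np.zeros((labels.size, k))
    Y[np.arange(labels.size), labels] = 1.0
    return Y

def add_bias(X):
    """Append a constant column so the classifier has an intercept."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def fit_ridge_classifier(X, labels, k, lam=1e-2):
    """Least-squares classifier on pseudo labels: W = (X'X + lam*I)^-1 X'Y."""
    Xb = add_bias(X)
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ one_hot(labels, k))

def align_to_reference(labels, ref, k):
    """Relabel `labels` with the permutation that agrees most with `ref`.

    Brute force over all k! permutations, so only suitable for small k.
    """
    best = max(itertools.permutations(range(k)),
               key=lambda p: np.sum(np.asarray(p)[labels] == ref))
    return np.asarray(best)[labels]

def consensus_classifier(X, base_clusterings, k, lam=1e-2):
    """Assumed simplification: average the per-clustering weight matrices."""
    ref = base_clusterings[0]
    aligned = [align_to_reference(b, ref, k) for b in base_clusterings]
    weights = [fit_ridge_classifier(X, b, k, lam) for b in aligned]
    return np.mean(weights, axis=0)

def predict(W, X):
    """Assign each row of X to the consensus cluster with the highest score."""
    return np.argmax(add_bias(X) @ W, axis=1)

# Toy data (hypothetical): two well-separated Gaussian blobs in 2-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-5.0, 1.0, size=(50, 2)),
               rng.normal(5.0, 1.0, size=(50, 2))])
truth = np.repeat([0, 1], 50)

# Three noisy base clusterings; the second one uses swapped label names.
b1 = truth.copy()
b1[:5] = 1 - b1[:5]
b2 = 1 - truth
b2[50:55] = 1 - b2[50:55]
b3 = truth.copy()
b3[95:] = 1 - b3[95:]

W = consensus_classifier(X, [b1, b2, b3], k=2)
consensus_labels = predict(W, X)

# Generalization: unseen points are assigned from their features alone.
new_points = np.array([[-5.0, -5.0], [5.0, 5.0]])
new_labels = predict(W, new_points)
```

Because the consensus is represented by a classifier over the original features, `predict` can label `new_points` without rerunning any base clustering, which is the generalization property the abstract highlights. The weight averaging here ignores how strongly the base clusterings agree with each other; the paper's priors over classifier parameters are what model those relationships.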

Author information

Authors and Affiliations

State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China
Liang Du, Yi-Dong Shen & Zhiwu Xu
Graduate University of Chinese Academy of Sciences, China
Liang Du, Yi-Dong Shen & Zhiwu Xu
University of Chinese Academy of Sciences, Beijing, 100049, China
Liang Du, Yi-Dong Shen & Zhiwu Xu
Baidu Inc., Beijing, 100085, China
Zhiyong Shen
Computing Center, Shanghai University, Shanghai, China
Jianying Wang

Editor information

Editors and Affiliations

Department of Computer Science and Technology, Tsinghua University, 100084, Beijing, China
Jianyong Wang
Management Science and Information Systems Department, Rutgers, the State University of New Jersey, 1, Washington Park, 07102, Newark, NJ, USA
Hui Xiong
Department of Information Engineering, Nagoya University, 464-8601, Nagoya, Japan
Yoshiharu Ishikawa
Department of Computer Science, Hong Kong Baptist University, Hong Kong
Jianliang Xu
School of Information Science and Engineering, Yanshan University, Qinhuangdao, China
Junfeng Zhou

About this paper

Cite this paper

Du, L., Shen, YD., Shen, Z., Wang, J., Xu, Z. (2013). A Self-Supervised Framework for Clustering Ensemble. In: Wang, J., Xiong, H., Ishikawa, Y., Xu, J., Zhou, J. (eds) Web-Age Information Management. WAIM 2013. Lecture Notes in Computer Science, vol 7923. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38562-9_26

DOI: https://doi.org/10.1007/978-3-642-38562-9_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38561-2
Online ISBN: 978-3-642-38562-9
eBook Packages: Computer Science, Computer Science (R0)
