Abstract
Semi-supervised clustering incorporates prior knowledge into the clustering process to achieve better performance. Recently, semi-supervised clustering with pairwise constraints has emerged as an important variant of the traditional clustering paradigm. In this paper, the disadvantages of pairwise constraints are analyzed in detail. To address these disadvantages, exemplars-constraints are first presented. Then, based on the exemplars-constraints, a semi-supervised clustering framework is described step by step, and an exemplars-constraints EM algorithm is designed. Finally, experiments on several UCI datasets show that exemplars-constraints work well and that the proposed algorithm outperforms both the corresponding unsupervised clustering algorithm and the semi-supervised algorithms based on pairwise constraints.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, H., Li, T., Li, T., Yang, Y. (2012). Exemplars-Constraints for Semi-supervised Clustering. In: Zhou, S., Zhang, S., Karypis, G. (eds) Advanced Data Mining and Applications. ADMA 2012. Lecture Notes in Computer Science, vol 7713. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-35527-1_10
DOI: https://doi.org/10.1007/978-3-642-35527-1_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-35526-4
Online ISBN: 978-3-642-35527-1
eBook Packages: Computer Science (R0)