Abstract
In this paper, we adapt Tuy’s concave cutting plane method to semi-supervised clustering. We also give properties of locally optimal solutions of the semi-supervised clustering problem. Numerical examples show that this method can give better solutions than other semi-supervised clustering algorithms.
References
Bar-Hillel A, Hertz T, Shental N, Weinshall D (2005) Learning a Mahalanobis metric from equivalence constraints. Journal of Machine Learning Research 6: 937–965
Basu S, Banerjee A, Mooney RJ (2003) Semi-supervised clustering by seeding. In: Sammut C, Hoffmann AG (eds) ICML: Machine learning, proceedings of the nineteenth international conference (ICML 2002). University of New South Wales, Sydney, Australia, Morgan Kaufmann, July 8–12, 2002, pp 27–34
Basu S, Bilenko M, Mooney RJ (2004) A probabilistic framework for semi-supervised clustering. In: Kim W, Kohavi R, Gehrke J, DuMouchel W (eds) KDD: proceedings of the tenth ACM SIGKDD international conference on knowledge discovery and data mining. Seattle, Washington, USA, ACM, August 22–25, pp 59–68
Bilenko M, Basu S, Mooney RJ (2004) Integrating constraints and metric learning in semi-supervised clustering. In: ICML’04: proceedings of the twenty-first international conference on machine learning. ACM Press, New York, NY, USA, p 11
Chang H, Yeung D-Y (2006) Locally linear metric adaptation with application to semi-supervised clustering and image retrieval. Pattern Recognition 39(7): 1253–1264
Cohn D, Caruana R, McCallum A (2003) Semi-supervised clustering with user feedback. Technical report, Cornell University
Davidson I, Ravi SS (2005) Clustering with constraints: feasibility issues and the k-means algorithm. In: Proceedings of the 2005 SIAM international conference on data mining
Demiriz A, Bennett KP, Embrechts MJ (1999) Semi-supervised clustering using genetic algorithms. In: Proceedings of ANNIE’99 (Artificial Neural Networks in Engineering). R.P.I. Math Report No. 9901, Rensselaer Polytechnic Institute, Troy, New York
Drineas P, Frieze AM, Kannan R, Vempala S, Vinay V (2004) Clustering large graphs via the singular value decomposition. Mach Learn 56(1–3): 9–33
Forrest JJ, Goldfarb D (1992) Steepest-edge simplex algorithms for linear programming. Math Programming 57(3, Ser. A): 341–374
Forrest JJH, Tomlin JA (1992) Implementing the simplex method for the optimization subroutine library. IBM Syst J 31(1): 11–25
Freund RW, Jarre F (1997) A QMR-based interior-point algorithm for solving linear programs. Math Program 76(1, Ser. B): 183–210. Interior point methods in theory and practice (Iowa City, IA, 1994)
Gao J, Tan P-N, Cheng H (2006) Semi-supervised clustering with partial background information. In: Ghosh J, Lambert D, Skillicorn DB, Srivastava J (eds) SDM’06: proceedings of the sixth SIAM international conference on data mining. SIAM, Bethesda, MD, USA, April 20–22
Gordon AD (1996) A survey of constrained classification. Comput Stat Data Anal 21(1): 17–29
Horst R, Tuy H (1993) Global optimization. Springer-Verlag, Berlin
Jain AK, Mallapragada PK, Law M (2006) Bayesian feedback in data clustering. In: ICPR. IEEE Computer Society, pp 374–378
Klein D, Kamvar SD, Manning CD (2002) From instance-level constraints to space-level constraints: making the most of prior knowledge in data clustering. In: Proceedings of the international conference on machine learning
Lange T, Law MHC, Jain AK, Buhmann JM (2005) Learning with constrained and unlabelled data. In: CVPR’05: proceedings of the 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), vol 1. IEEE Computer Society, Washington, DC, USA, pp 731–738
Murphy PM, Aha DW (1994) UCI repository of machine learning databases. Technical report, University of California, Department of Information and Computer Science, Irvine, CA. http://www.ics.uci.edu/~mlearn/MLRepository.html
Nemhauser GL, Wolsey LA (1988) Integer and combinatorial optimization. Wiley-Interscience Series in Discrete Mathematics and Optimization. Wiley, A Wiley-Interscience Publication, New York
Nesterov Y (2004) Introductory lectures on convex optimization, volume 87 of Applied Optimization. Kluwer Academic Publishers, Boston, MA (A basic course)
Nesterov Y, Nemirovskii A (1994) Interior-point polynomial algorithms in convex programming, volume 13 of SIAM studies in applied mathematics. Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA
Shental N, Bar-Hillel A, Hertz T, Weinshall D (2003) Computing Gaussian mixture models with EM using equivalence constraints. In: Thrun S, Saul LK, Schölkopf B (eds) NIPS. MIT Press
Tuy H (1964) Concave programming under linear constraints. Soviet Math 5: 1437–1440
Wagstaff K, Cardie C (2000) Clustering with instance-level constraints. In: Proceedings of the 17th international conference on machine learning. Morgan Kaufmann, San Francisco, CA, pp 1103–1110
Wagstaff K, Cardie C, Rogers S, Schroedl S (2001) Constrained k-means clustering with background knowledge. In: ICML’01: proceedings of the eighteenth international conference on machine learning. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 577–584
Xia Y (2007) Constrained clustering via concavity cuts. In: CPAIOR’07: proceedings of the fourth international conference on integration of AI and OR techniques in constraint programming for combinatorial optimization problems, pp 318–331. http://dx.doi.org/10.1007/978-3-540-72397-4_23, http://dblp.uni-trier.de
Xia Y, Peng J (2005) A cutting algorithm for the minimum sum-of-squared error clustering. In: Proceedings of the fifth SIAM international conference on data mining, pp 150–160
Xing EP, Ng AY, Jordan MI, Russell S (2002) Distance metric learning with application to clustering with side-information. In: Thrun S, Becker S, Obermayer K (eds) Advances in neural information processing systems, vol 15. MIT Press, Cambridge, MA, pp 505–512
Additional information
Responsible editor: Charu Aggarwal.
Cite this article
Xia, Y. A global optimization method for semi-supervised clustering. Data Min Knowl Disc 18, 214–256 (2009). https://doi.org/10.1007/s10618-008-0104-3
DOI: https://doi.org/10.1007/s10618-008-0104-3