Abstract

In this paper, we adapt Tuy’s concave cutting plane method to the problem of finding an optimal grouping in semi-supervised clustering. We also give properties of locally optimal solutions to the semi-supervised clustering problem. On test data sets with up to 1500 points, our algorithm typically finds, within 4 seconds, a solution whose objective value is lower than that obtained by the k-means algorithm by around 2% of the initial function value, although its run time is a hundred times that of the k-means algorithm.
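To make the quantities in the abstract concrete, here is a minimal sketch, not the paper's algorithm. It assumes the objective is the sum of squared distances to cluster centroids (the usual k-means objective), that the side information takes the form of pairwise must-link and cannot-link constraints, and that the "2%" figure is the gap between the two final objective values divided by the objective at the starting assignment. None of these is confirmed by the abstract itself, and the function names `sse`, `satisfies`, and `relative_gain` are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Point = Sequence[float]
Pair = Tuple[int, int]


def sse(points: Sequence[Point], labels: Sequence[int]) -> float:
    """Sum of squared distances from each point to its cluster centroid."""
    clusters: Dict[int, List[Point]] = {}
    for p, c in zip(points, labels):
        clusters.setdefault(c, []).append(p)
    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def satisfies(labels: Sequence[int],
              must_link: Sequence[Pair],
              cannot_link: Sequence[Pair]) -> bool:
    """Check pairwise constraints (an assumed form of the supervision)."""
    return (all(labels[i] == labels[j] for i, j in must_link)
            and all(labels[i] != labels[j] for i, j in cannot_link))


def relative_gain(initial: float, baseline: float, candidate: float) -> float:
    """Assumed reading of the reported figure: objective gap over the initial value."""
    return (baseline - candidate) / initial


# Toy data: two tight pairs of points, far apart along the x-axis.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]
start = [0, 1, 0, 1]    # poor starting assignment
better = [0, 0, 1, 1]   # assignment that groups nearby points

print(sse(points, start), sse(points, better))
print(satisfies(better, must_link=[(0, 1)], cannot_link=[(0, 2)]))
```

Under this reading, a method that improves on the k-means result by 2% of the initial function value would give `relative_gain(initial, kmeans_value, new_value) == 0.02`.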

Editor information

Pascal Van Hentenryck Laurence Wolsey

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Xia, Y. (2007). Constrained Clustering Via Concavity Cuts. In: Van Hentenryck, P., Wolsey, L. (eds) Integration of AI and OR Techniques in Constraint Programming for Combinatorial Optimization Problems. CPAIOR 2007. Lecture Notes in Computer Science, vol 4510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72397-4_23

  • DOI: https://doi.org/10.1007/978-3-540-72397-4_23

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72396-7

  • Online ISBN: 978-3-540-72397-4

  • eBook Packages: Computer Science, Computer Science (R0)
