Graph-Based Clustering with Constraints

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2011)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6635)

Abstract

A common way to add background knowledge to clustering algorithms is through constraints. Although several algorithms incorporate constraints into the clustering process, graph-based clustering with constraints has received little attention. In this paper, we propose a constrained graph-based clustering method and argue that embedding the constraints in the distance function before graph partitioning leads to better results. We also specify a novel approach for adding constraints by introducing distance limit criteria, and we examine how this approach performs against earlier approaches that use a fixed distance measure for constraints. The proposed approach and its variants are evaluated on UCI datasets and compared with other constrained clustering algorithms that embed constraints in a similar fashion.
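
The general idea of embedding constraints in the distance function before building the graph can be illustrated with a minimal sketch. This is not the paper's method: the abstract does not define the distance limit criteria, so the sketch uses the simpler fixed-distance scheme the abstract refers to as an earlier approach. The function names, the choice of 0 for must-link pairs, and the choice of "largest distance plus one" for cannot-link pairs are all assumptions made for illustration, and the graph partitioning step itself is left out.

```python
import math

def pairwise_distances(points):
    """Euclidean distance matrix for a list of coordinate tuples."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def apply_constraints(dist, must_link, cannot_link):
    """Return a copy of dist with fixed values written in for constrained pairs.

    Assumption (not from the paper): must-link pairs get distance 0 and
    cannot-link pairs get a value just above the largest observed distance.
    """
    far = max(max(row) for row in dist) + 1.0
    out = [row[:] for row in dist]
    for i, j in must_link:
        out[i][j] = out[j][i] = 0.0
    for i, j in cannot_link:
        out[i][j] = out[j][i] = far
    return out

def knn_graph(dist, k):
    """Symmetric k-nearest-neighbour graph as a set of undirected edges."""
    edges = set()
    for i, row in enumerate(dist):
        others = [j for j in range(len(row)) if j != i]
        for j in sorted(others, key=lambda j: row[j])[:k]:
            edges.add((min(i, j), max(i, j)))
    return edges

# Four points on a line: two tight pairs far apart from each other.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
dist = pairwise_distances(points)

# Hypothetical constraints: points 0 and 3 belong together, 0 and 1 do not.
constrained = apply_constraints(dist, must_link=[(0, 3)], cannot_link=[(0, 1)])

plain_graph = knn_graph(dist, k=1)
constrained_graph = knn_graph(constrained, k=1)
```

Because the constraints change the distances before the nearest-neighbour graph is built, they change which edges exist at all: here the edge between points 0 and 1 disappears and an edge between points 0 and 3 appears. Any graph partitioner applied afterwards then works on a graph that already reflects the background knowledge.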



Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Anand, R., Reddy, C.K. (2011). Graph-Based Clustering with Constraints. In: Huang, J.Z., Cao, L., Srivastava, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2011. Lecture Notes in Computer Science, vol 6635. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-20847-8_5

  • DOI: https://doi.org/10.1007/978-3-642-20847-8_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-20846-1

  • Online ISBN: 978-3-642-20847-8

  • eBook Packages: Computer Science, Computer Science (R0)
