Pairwise Constrained Fuzzy Clustering: Relation, Comparison and Parallelization

International Journal of Fuzzy Systems

Abstract

Although clustering with pairwise constraints through penalty regularization has been widely adopted in existing semi-supervised clustering approaches, little work has been done on the theoretical comparison of these approaches with respect to the differences in their penalties. In this paper, we first propose two types of penalties in the context of pairwise constrained fuzzy clustering. The first accounts for the overall consistency of assignments, in terms of fuzzy memberships, for constrained pairs. The second is the total Euclidean distance between the membership vectors of constrained pairs. Through analytical discussion, we establish the connection between the different penalties, which provides a unified view and a better understanding of each of them. Following the idea of penalty regularization, variants of pairwise constrained fuzzy c-means are formulated by incorporating the consistency-type and distance-type penalties, respectively, into the objective function of fuzzy c-means. We also extend this idea to co-clustering by considering pairwise constraints on two types of objects to produce fuzzy co-clusters. Efficient and scalable algorithms are proposed for parallel implementation. Experimental results on real-world datasets show good performance of the proposed approaches with respect to effectiveness, efficiency and scalability.
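
The two penalty types described in the abstract can be illustrated with a minimal Python sketch. The exact formulas are not given in this preview, so the forms below are assumptions: the consistency-type penalty uses the inner product of membership vectors, the distance-type penalty uses the squared Euclidean distance, and cannot-link pairs enter with the opposite sign. All function names, the weight `alpha` and the fuzzifier `m` are hypothetical and chosen only for illustration.

```python
def consistency(u_i, u_j):
    """Inner product of two membership vectors (assumed consistency measure)."""
    return sum(a * b for a, b in zip(u_i, u_j))


def squared_distance(p, q):
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def consistency_penalty(U, must_link, cannot_link):
    """Assumed form: must-link pairs are penalized for inconsistency,
    cannot-link pairs for consistency."""
    ml = sum(1.0 - consistency(U[i], U[j]) for i, j in must_link)
    cl = sum(consistency(U[i], U[j]) for i, j in cannot_link)
    return ml + cl


def distance_penalty(U, must_link, cannot_link):
    """Assumed form: must-link pairs are penalized by the squared distance
    between membership vectors, cannot-link pairs are rewarded by it."""
    ml = sum(squared_distance(U[i], U[j]) for i, j in must_link)
    cl = sum(squared_distance(U[i], U[j]) for i, j in cannot_link)
    return ml - cl


def fcm_objective(X, V, U, m=2.0):
    """Standard fuzzy c-means objective: sum_i sum_c u_ic^m * ||x_i - v_c||^2."""
    return sum(
        (U[i][c] ** m) * squared_distance(X[i], V[c])
        for i in range(len(X))
        for c in range(len(V))
    )


def penalized_objective(X, V, U, must_link, cannot_link, penalty, alpha=1.0, m=2.0):
    """Fuzzy c-means objective plus a weighted pairwise penalty (a sketch)."""
    return fcm_objective(X, V, U, m) + alpha * penalty(U, must_link, cannot_link)


if __name__ == "__main__":
    X = [[0.0], [0.1], [1.0]]                  # three 1-D objects
    V = [[0.0], [1.0]]                         # two cluster centers
    U = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]   # rows sum to 1
    must_link, cannot_link = [(0, 1)], [(0, 2)]
    print(penalized_objective(X, V, U, must_link, cannot_link, consistency_penalty))
    print(penalized_objective(X, V, U, must_link, cannot_link, distance_penalty))
```

Under these assumed forms the two penalties are linked by the identity ||u_i - u_j||^2 = ||u_i||^2 + ||u_j||^2 - 2 u_i·u_j, so the distance-type penalty equals the consistency term scaled by -2 plus the squared norms of the membership vectors. This is one plausible reading of the connection the abstract mentions, not a statement of the paper's derivation.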

Notes

  1. http://www.ics.uci.edu/mlearn/MLRepository.html.

  2. Available at http://www.cse.fau.edu/zhong/pubs.htm.

  3. http://trec.nist.gov.

  4. http://www.daviddlewis.com/resources/testcollections.

  5. http://glaros.dtc.umn.edu/gkhome/.

  6. http://archive.ics.uci.edu/ml/datasets/US+Census+Data+(1990).

  7. http://spark.apache.org/.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant No. 61502420), and the Zhejiang Provincial Natural Science Foundation of China (Grant No. LY16F020032).

Author information

Corresponding author

Correspondence to Jian-Ping Mei.

Cite this article

Mei, JP., Lv, H., Cao, J. et al. Pairwise Constrained Fuzzy Clustering: Relation, Comparison and Parallelization. Int. J. Fuzzy Syst. 21, 1938–1949 (2019). https://doi.org/10.1007/s40815-019-00683-1
