Semi-supervised multi-view binary learning for large-scale image clustering

Applied Intelligence

Abstract

Large-scale image clustering has attracted sustained attention in machine learning. Traditional methods based on real-valued representations often incur high storage and computation costs. To address these problems, methods based on binary representations and multi-view learning have recently been introduced. However, improving clustering performance remains a challenge. Because a portion of the labels is often available in advance, we further exploit this label information in multi-view binary learning. The label information benefits the design of the similarity matrix, which plays an important role in the clustering problem. We therefore propose a new method, Semi-supervised Multi-view Binary Learning (SMBL). It is evaluated on four benchmark data sets and compared with several commonly used large-scale and semi-supervised clustering approaches. Extensive experimental results show that the proposed method achieves superior performance.
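The two ingredients named in the abstract, binary codes and a label-informed similarity matrix, can be illustrated with a minimal sketch. This is not the SMBL formulation, which the abstract does not specify. The function names, the random-projection binarization, and the +1/-1/0 similarity convention are all assumptions chosen for illustration.

```python
import numpy as np

def binarize(features, n_bits, seed=0):
    """Map real-valued features to n_bits 0/1 codes.

    Assumption: a random projection followed by a sign test, used here
    only as a stand-in for a learned binary encoder.
    """
    rng = np.random.default_rng(seed)
    projection = rng.standard_normal((features.shape[1], n_bits))
    centered = features - features.mean(axis=0)
    return (centered @ projection > 0).astype(np.uint8)

def hamming_distances(codes):
    """Pairwise Hamming distances between the rows of a 0/1 code matrix."""
    codes = codes.astype(np.int32)
    ones = codes.sum(axis=1)
    # For 0/1 vectors a and b: d(a, b) = |a| + |b| - 2 * (a . b)
    return ones[:, None] + ones[None, :] - 2 * (codes @ codes.T)

def label_similarity(labels):
    """Pairwise similarity from partial labels (hypothetical convention).

    labels[i] == -1 marks an unlabeled sample. The entry S[i, j] is
    +1 for the same label, -1 for different labels, and 0 when either
    sample is unlabeled.
    """
    labels = np.asarray(labels)
    known = labels >= 0
    both_known = known[:, None] & known[None, :]
    same = labels[:, None] == labels[None, :]
    return np.where(both_known, np.where(same, 1, -1), 0)
```

Packing the codes with `np.packbits(codes, axis=1)` stores eight bits per byte, which is where the storage saving over real-valued features comes from. The zero entries of the similarity matrix are the pairs for which no supervision is available.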

Notes

  1. https://archive.ics.uci.edu/ml/datasets/Multiple+Features

  2. http://www.vision.caltech.edu/Image_Datasets/Caltech101/Caltech101.html

  3. https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html

  4. https://www.cs.toronto.edu/~kriz/cifar.html

  5. https://github.com/sepehrband/drawNemenyi

Acknowledgements

This work was supported in part by the Key-Area Research and Development Program of Guangdong Province under Grants 2019B010154002, 2019B010118001, and 2019B010121001; in part by the National Natural Science Foundation of China under Grants 61803096, 61801133, and U191140003; in part by the Guangzhou Science and Technology Program Project under Grant 202002030289; and in part by the Natural Science Foundation of Guangdong Province under Grants 2020A1515010768 and 2022A1515010688.

Author information

Corresponding author

Correspondence to Wei Han.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article belongs to the Topical Collection: Special Issue on Multi-view Learning. Guest Editors: Guoqing Chao, Xingquan Zhu, Weiping Ding, Jinbo Bi and Shiliang Sun.

About this article

Cite this article

Liu, M., Yang, Z., Han, W. et al. Semi-supervised multi-view binary learning for large-scale image clustering. Appl Intell 52, 14853–14870 (2022). https://doi.org/10.1007/s10489-022-03205-z

  • DOI: https://doi.org/10.1007/s10489-022-03205-z
