Abstract
Large-scale image clustering has attracted sustained attention in machine learning. Traditional methods based on real-valued representations often suffer from high storage and computation costs. To address these problems, methods based on binary representations and multi-view learning have recently been introduced. However, improving clustering performance remains a challenge. Because a portion of the labels is available in advance in many cases, we further exploit label information in multi-view binary learning. This information benefits the design of the similarity matrix, which plays an important role in the clustering problem. As a result, we propose a new method, Semi-supervised Multi-view Binary Learning (SMBL). It is evaluated on four benchmark data sets and compared with several commonly used large-scale and semi-supervised clustering approaches. Extensive experimental results show that the proposed method achieves superior performance.
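As a rough illustration of how partial labels can shape a similarity matrix, the sketch below builds a pairwise matrix from a partially labeled sample set and compares binary codes by Hamming distance. It is not the SMBL formulation: the +1/-1/0 encoding, the function names `label_similarity` and `hamming`, and the toy codes are all assumptions made for this example.

```python
from typing import List, Optional


def label_similarity(labels: List[Optional[int]]) -> List[List[int]]:
    """Pairwise similarity from partial labels (None means unlabeled).

    Assumed encoding (illustrative only):
      S[i][j] =  1 if both samples are labeled with the same class,
      S[i][j] = -1 if both are labeled with different classes,
      S[i][j] =  0 if either label is unknown.
    """
    n = len(labels)
    S = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if labels[i] is None or labels[j] is None:
                continue  # no supervision available for this pair
            S[i][j] = 1 if labels[i] == labels[j] else -1
    return S


def hamming(a: List[int], b: List[int]) -> int:
    """Hamming distance between two equal-length binary codes."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Toy data: four samples, the last one unlabeled.
labels = [0, 0, 1, None]
S = label_similarity(labels)

# Hypothetical 4-bit codes in {-1, +1}; real codes would be learned.
codes = [
    [1, 1, -1, 1],
    [1, 1, -1, -1],
    [-1, -1, 1, 1],
    [1, -1, 1, 1],
]
```

Under this assumed encoding, a supervised binary-learning objective would typically favor small Hamming distances for pairs with `S[i][j] == 1` and large ones for pairs with `S[i][j] == -1`, while unlabeled pairs contribute no label-based constraint.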
Acknowledgements
This work was supported in part by the Key-Area Research and Development Program of Guangdong Province under Grants 2019B010154002, 2019B010118001, and 2019B010121001; in part by the National Natural Science Foundation of China under Grants 61803096, 61801133, and U191140003; in part by the Guangzhou Science and Technology Program Project under Grant 202002030289; and in part by the Natural Science Foundation of Guangdong Province under Grants 2020A1515010768 and 2022A1515010688.
Additional information
This article belongs to the Topical Collection: Special Issue on Multi-view Learning. Guest Editors: Guoqing Chao, Xingquan Zhu, Weiping Ding, Jinbo Bi and Shiliang Sun
Cite this article
Liu, M., Yang, Z., Han, W. et al. Semi-supervised multi-view binary learning for large-scale image clustering. Appl Intell 52, 14853–14870 (2022). https://doi.org/10.1007/s10489-022-03205-z
DOI: https://doi.org/10.1007/s10489-022-03205-z