Deep contrastive multi-view clustering with doubly enhanced commonality

  • Regular Paper
  • Multimedia Systems

Abstract

Deep multi-view clustering based on autoencoders has recently attracted significant attention because it can improve feature learning and clustering jointly. However, existing autoencoder-based methods tend either to overemphasize view-specific information and neglect the information shared across views, or to focus too heavily on shared information and dilute the complementary information carried by individual views. Following the principle that commonality resides within individuality, this paper proposes a staged training approach with two phases: pre-training and fine-tuning. The pre-training phase learns view-specific information; the fine-tuning phase doubly enhances commonality across views while preserving that view-specific information. Specifically, in the pre-training stage, the autoencoder learns and extracts the specific information of each view. In the fine-tuning stage, a transformer layer first enhances the commonality among the independent view-specific representations, and contrastive learning on the semantic labels of each view then further strengthens this commonality, yielding more accurate clustering results.
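
To make the last step more concrete, the following is a minimal sketch of what contrastive learning on semantic labels could look like for two views. It is not the loss defined in the paper, whose exact form the abstract does not give. It assumes an InfoNCE-style objective in which each column of a soft cluster-assignment matrix is treated as one cluster's representation, the same cluster in the two views forms the positive pair, and all other clusters act as negatives. The function name `label_contrastive_loss`, the cosine similarity, and the temperature of 0.5 are illustrative assumptions.

```python
import numpy as np


def _log_softmax_rows(logits):
    """Numerically stable log-softmax applied to each row."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def label_contrastive_loss(q_a, q_b, temperature=0.5, eps=1e-12):
    """InfoNCE-style loss between the semantic labels of two views.

    q_a, q_b: soft cluster assignments of shape (n_samples, n_clusters).
    This formulation is an assumption for illustration, not the paper's loss.
    """
    # Treat each cluster (column) as a vector over samples and L2-normalise it,
    # so that dot products below are cosine similarities.
    c_a = q_a.T / (np.linalg.norm(q_a.T, axis=1, keepdims=True) + eps)
    c_b = q_b.T / (np.linalg.norm(q_b.T, axis=1, keepdims=True) + eps)
    k = c_a.shape[0]

    # Stack the cluster vectors of both views: rows 0..k-1 come from view a,
    # rows k..2k-1 from view b.
    reps = np.concatenate([c_a, c_b], axis=0)
    sim = reps @ reps.T / temperature

    # A cluster vector must not be contrasted with itself.
    np.fill_diagonal(sim, -np.inf)
    log_prob = _log_softmax_rows(sim)

    # The positive of cluster j in one view is cluster j in the other view.
    positives = np.concatenate([np.arange(k, 2 * k), np.arange(0, k)])
    return -log_prob[np.arange(2 * k), positives].mean()


# Hypothetical toy data: six samples, three clusters, hard assignments.
toy_labels = np.array([0, 0, 1, 1, 2, 2])
q_view1 = np.eye(3)[toy_labels]
q_view2_consistent = q_view1.copy()
q_view2_shuffled = q_view1[:, [1, 2, 0]]  # cluster indices disagree across views

loss_consistent = label_contrastive_loss(q_view1, q_view2_consistent)
loss_shuffled = label_contrastive_loss(q_view1, q_view2_shuffled)
```

In this toy setup the loss is smaller when the two views assign samples to the same cluster indices than when the indices are shuffled, which is the behaviour a label-level contrastive objective is meant to reward.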

Data availability

No datasets were generated or analysed during the current study.

Acknowledgements

This work is supported by the National Natural Science Foundation of China (CN) [62276164, 61602296], the ‘Science and Technology Innovation Action Plan’ Natural Science Foundation of Shanghai [22ZR1427000], and the Shanghai Oriental Talent Program-Youth Program. The authors thank these funders for their support.

Funding

Natural Science Foundation of Shanghai (22ZR1427000), National Natural Science Foundation of China (CN) (62276164, 61602296).

Author information

Contributions

Yang Zhiyuan played the central role in this study: he proposed the ideas, wrote the code required for the experiments, and set up the experimental environment. Zhu Changming and Li Zishi reviewed the manuscript to check its accuracy and integrity.

Corresponding author

Correspondence to Changming Zhu.

Ethics declarations

Conflict of interest

The authors declare no conflict of interest.

Additional information

Communicated by Yongdong Zhang.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Z., Zhu, C. & Li, Z. Deep contrastive multi-view clustering with doubly enhanced commonality. Multimedia Systems 30, 196 (2024). https://doi.org/10.1007/s00530-024-01400-1

  • DOI: https://doi.org/10.1007/s00530-024-01400-1
