A semantic-consistency asymmetric matrix factorization hashing method for cross-modal retrieval

Multimedia Tools and Applications

Abstract

Hashing methods have recently received widespread attention for cross-modal retrieval because of their flexibility and effectiveness. However, existing cross-modal hashing methods share a common challenge: how to exploit semantic information effectively to learn discriminative hash codes while keeping storage and computation costs low. To address this issue, we propose an efficient Semantic-consistency Asymmetric Matrix Factorization Hashing (SAMFH) method. Specifically, the method first uses matrix factorization to obtain latent semantic representations for the different modalities and a label representation for the class label information. To further exploit semantic information and learn discriminative binary codes, we adopt an asymmetric supervised learning strategy that fuses the pairwise semantic matrix into the framework. Finally, we directly update the unified hash codes with an efficient discrete optimization strategy. Experimental results on three benchmark datasets demonstrate that SAMFH outperforms many state-of-the-art cross-modal hashing methods.
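To make the ingredients named in the abstract more concrete, the following is a minimal sketch in plain Python and is not the SAMFH algorithm itself. It assumes a pairwise semantic matrix whose entries are +1 when two items share a class label and -1 otherwise, a sign function for binarization, and one common asymmetric objective from the hashing literature in which a binary code matrix B and a real-valued matrix V jointly approximate r·S (r being the code length). All function names and the toy data are hypothetical and chosen only for illustration.

```python
def pairwise_semantic_matrix(labels):
    """S[i][j] = 1 if items i and j share at least one class label, else -1.

    `labels` is a list of multi-hot label vectors (assumed format).
    """
    n = len(labels)
    return [
        [1 if any(a and b for a, b in zip(labels[i], labels[j])) else -1
         for j in range(n)]
        for i in range(n)
    ]


def sign_codes(latent):
    """Binarize real-valued latent vectors into {-1, +1} codes (0 maps to +1)."""
    return [[1 if v >= 0 else -1 for v in row] for row in latent]


def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))


def rank_by_hamming(query_code, db_codes):
    """Database indices sorted by Hamming distance to the query (ties keep index order)."""
    return sorted(range(len(db_codes)),
                  key=lambda i: hamming(query_code, db_codes[i]))


def asymmetric_fit_error(S, B, V):
    """Squared Frobenius error || r*S - B V^T ||^2 (assumed asymmetric objective).

    B holds binary codes, V holds real-valued representations, and r is the
    code length. Only B is constrained to be binary, hence "asymmetric".
    """
    r = len(B[0])
    err = 0.0
    for i in range(len(B)):
        for j in range(len(V)):
            inner = sum(b * v for b, v in zip(B[i], V[j]))
            err += (r * S[i][j] - inner) ** 2
    return err


# Toy data: items 0 and 1 share a class, item 2 belongs to a different one.
labels = [[1, 0], [1, 0], [0, 1]]
latent = [[0.9, -0.2], [0.4, -0.7], [-0.5, 0.3]]  # hypothetical latent vectors

S = pairwise_semantic_matrix(labels)
codes = sign_codes(latent)
ranking = rank_by_hamming(codes[0], codes)
error = asymmetric_fit_error(S, codes, codes)
```

In this toy setting the codes of the two same-class items coincide, so a query using either one ranks the other ahead of the different-class item. In an actual cross-modal setting the query and database codes would come from different modalities, for example an image query against text codes, mapped into the same Hamming space.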

Acknowledgments

This work was supported by the Natural Science Foundation of China (71772107, 62072288), the Natural Science Foundation of Shandong Province of China (ZR2020MF044, ZR202102230289, ZR2019MF003, ZR2021MF104), the Shandong Education Quality Improvement Plan for Postgraduate (2021), the SDUST Research Fund, and the Humanity and Social Science Fund of the Ministry of Education under Grants 20YJAZH078 and 20YJAZH127.

Author information

Corresponding author

Correspondence to Shujuan Ji.

Ethics declarations

Conflict of interest

The authors have no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Liu, Y., Ji, S., Fu, Q. et al. A semantic-consistency asymmetric matrix factorization hashing method for cross-modal retrieval. Multimed Tools Appl 83, 6621–6649 (2024). https://doi.org/10.1007/s11042-023-15535-2

  • DOI: https://doi.org/10.1007/s11042-023-15535-2
