Semantic preserving asymmetric discrete hashing for cross-modal retrieval

Yang, Fan; Zhang, Qiao-xi; Ding, Xiao-jian; Ma, Fu-min; Cao, Jie; Tong, De-yu

doi:10.1007/s10489-022-04282-w

Published: 15 November 2022

Volume 53, pages 15352–15371, (2023)
Applied Intelligence

Fan Yang ORCID: orcid.org/0000-0001-6861-9596¹,
Qiao-xi Zhang¹,
Xiao-jian Ding¹,
Fu-min Ma¹,
Jie Cao¹ &
…
De-yu Tong¹

Abstract

In recent years, hashing techniques have attracted substantial attention and achieved notable results owing to their low storage cost and high retrieval efficiency. However, most existing approaches build a large pairwise similarity matrix to preserve the similarity relationships of the original space, which incurs heavy time and space overhead and discards class information, so these approaches do not scale to large multimedia datasets. In addition, most cross-modal techniques learn the hash functions and the binary representations simultaneously, which makes optimization more difficult. To tackle these issues, we develop a hashing approach called Semantic preserving Asymmetric discrete Hashing for cross-modal retrieval (SEAH), which aims to preserve similarity based on both the global semantic information and the local similarity structure. Specifically, SEAH adopts an asymmetric learning scheme and embeds class attribute information to boost the discriminative power of the learned binary codes. SEAH then employs a well-designed optimization algorithm that iterates efficiently and avoids the quantization error problem. In addition, SEAH is a two-stage approach; two algorithms, SEAH-t and SEAH-s, are developed for the second stage. The first adopts linear classifiers as hash functions, while the second is a semantic-enhanced strategy that uses distance-distance difference minimization to strengthen the to-be-learned hash functions. Extensive experiments on three widely used benchmark datasets show that SEAH-t and SEAH-s not only outperform several state-of-the-art approaches but also retain query and storage efficiency.
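The two-stage idea mentioned in the abstract (binary codes first, hash functions second) can be illustrated with a minimal sketch. This is not the SEAH algorithm: the first stage is replaced by a random sign projection of the labels, the second stage uses plain ridge regression as a stand-in for a linear hash function, and all names (learn_linear_hash, hash_codes, hamming_distance), the synthetic features, and the parameter values are assumptions made only for illustration.

```python
import numpy as np

def learn_linear_hash(X, B, lam=1.0):
    """Ridge-regress features X (n, d) onto codes B (n, r) with entries in {-1, +1}."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ B)

def hash_codes(X, W):
    """Project with W and binarise; non-negative projections map to +1."""
    return np.where(X @ W >= 0, 1, -1)

def hamming_distance(query_code, db_codes):
    """Hamming distance between one r-bit {-1, +1} code and each row of db_codes."""
    r = db_codes.shape[1]
    # For +/-1 codes, dot = r - 2 * hamming, so hamming = (r - dot) / 2.
    return (r - db_codes @ query_code) // 2

rng = np.random.default_rng(0)
n, d_img, d_txt, c, r = 200, 32, 16, 4, 16

# Synthetic toy data (assumption): both modalities are noisy linear views of one-hot labels.
labels = np.eye(c)[rng.integers(0, c, size=n)]
X_img = labels @ rng.normal(size=(c, d_img)) + 0.1 * rng.normal(size=(n, d_img))
X_txt = labels @ rng.normal(size=(c, d_txt)) + 0.1 * rng.normal(size=(n, d_txt))

# Stage 1 stand-in (assumption): shared codes from a random sign projection of the labels.
B = np.where(labels @ rng.normal(size=(c, r)) >= 0, 1, -1)

# Stage 2: one linear hash function per modality, fitted to the shared codes.
W_img = learn_linear_hash(X_img, B)
W_txt = learn_linear_hash(X_txt, B)

# Text-to-image retrieval: hash the query, rank the image database by Hamming distance.
db = hash_codes(X_img, W_img)
q = hash_codes(X_txt[:1], W_txt)[0]
dist = hamming_distance(q, db)
ranking = np.argsort(dist, kind="stable")
```

Because the codes are fixed before the hash functions are fitted, each modality reduces to an independent regression problem, which is the practical appeal of decoupling the two steps. Storage per item is r bits, and ranking needs only integer dot products.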

Code Availability

Data available on request from the authors.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Nos. 62002156, 61973151, 92046026), the Natural Science Foundation of Jiangsu Province (BK20191406, BK20200839), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (19KJB520035), and the International Science and Technology Cooperation Project of Jiangsu Province (BZ2020008).

Author information

Authors and Affiliations

College of Information Engineering, Nanjing University of Finance and Economics, Nanjing, 210023, China
Fan Yang, Qiao-xi Zhang, Xiao-jian Ding, Fu-min Ma, Jie Cao & De-yu Tong

Corresponding author

Correspondence to Fan Yang.

Ethics declarations

Conflict of Interests

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Xiao-jian Ding, Fu-min Ma, Jie Cao and De-yu Tong contributed equally to this work.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, F., Zhang, Qx., Ding, Xj. et al. Semantic preserving asymmetric discrete hashing for cross-modal retrieval. Appl Intell 53, 15352–15371 (2023). https://doi.org/10.1007/s10489-022-04282-w

Accepted: 18 October 2022
Published: 15 November 2022
Issue Date: June 2023
DOI: https://doi.org/10.1007/s10489-022-04282-w
