Abstract
In recent years, hashing techniques have attracted substantial attention and achieved notable results owing to their low storage cost and high retrieval efficiency. However, most existing approaches build a massive pairwise similarity matrix to preserve the similarity relationships of the original space, which incurs heavy time and space overhead and discards class information, so these approaches do not scale to large multimedia datasets. In addition, most cross-modal techniques learn the hash functions and the binary representations simultaneously, which makes optimization more difficult. To tackle these issues, we develop a hashing approach called Semantic preserving Asymmetric discrete Hashing for cross-modal retrieval (SEAH), which aims to preserve the similarity metric based on global semantic information and the local similarity structure. Specifically, SEAH adopts an asymmetric learning scheme and embeds class attribute information to strengthen the discriminative power of the learned binary codes. SEAH then employs a well-designed optimization algorithm to achieve efficient iterative optimization, thus avoiding the quantization error problem. SEAH is a two-stage approach; in the second stage, two algorithms, SEAH-t and SEAH-s, are developed. The former adopts linear classifiers as hash functions, while the latter is a semantic-enhanced strategy that uses distance-distance difference minimization to improve the capability of the hash functions to be learned. Extensive experiments on three widely used benchmark datasets show that the proposed SEAH-t and SEAH-s not only outperform several state-of-the-art approaches but also retain query and storage efficiency.
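The scalability concern about pairwise similarity matrices can be illustrated with a small sketch. It is not the SEAH algorithm. It assumes, purely for illustration, that similarity is defined as the inner product of label vectors (S = L Lᵀ), and it shows that a product of the binary codes with S can then be regrouped so that the n × n matrix is never stored. The function names, the toy data, and the choice of S are all assumptions, not taken from the paper.

```python
# Illustrative sketch only: NOT the SEAH algorithm. All names are hypothetical,
# and the label inner-product similarity S = L @ L^T is an assumed definition.

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    inner = len(b)
    cols = len(b[0])
    return [[sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
            for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def naive_code_similarity_product(codes, labels):
    """Compute codes^T @ S with S = labels @ labels^T built explicitly (n x n)."""
    similarity = matmul(labels, transpose(labels))  # n x n: the costly step
    return matmul(transpose(codes), similarity)     # r x n


def factored_code_similarity_product(codes, labels):
    """Same product regrouped as (codes^T @ labels) @ labels^T; no n x n matrix."""
    codes_t_labels = matmul(transpose(codes), labels)  # r x c, small
    return matmul(codes_t_labels, transpose(labels))   # r x n


# Toy data: n = 4 samples, c = 3 classes (multi-hot labels), r = 2-bit codes.
labels = [[1, 0, 0],
          [1, 1, 0],
          [0, 0, 1],
          [0, 1, 1]]
codes = [[1, -1],
         [1, 1],
         [-1, 1],
         [-1, -1]]

naive = naive_code_similarity_product(codes, labels)
factored = factored_code_similarity_product(codes, labels)
```

Both functions return the same r × n result, but the factored version only forms an r × c intermediate, so its memory use grows linearly with the number of samples n rather than quadratically.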
Code Availability
Data available on request from the authors.
Acknowledgements
This work was supported by the National Natural Science Foundation of China (Nos. 62002156, 61973151, 92046026), the Natural Science Foundation of Jiangsu Province (BK20191406, BK20200839), the Natural Science Foundation of the Jiangsu Higher Education Institutions of China (19KJB520035), and the International Science and Technology Cooperation Project of Jiangsu Province (BZ2020008).
Ethics declarations
Conflict of Interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Additional information
Xiao-jian Ding, Fu-min Ma, Jie Cao and De-yu Tong contributed equally to this work.
About this article
Cite this article
Yang, F., Zhang, Qx., Ding, Xj. et al. Semantic preserving asymmetric discrete hashing for cross-modal retrieval. Appl Intell 53, 15352–15371 (2023). https://doi.org/10.1007/s10489-022-04282-w
DOI: https://doi.org/10.1007/s10489-022-04282-w