Latent semantic-enhanced discrete hashing for cross-modal retrieval

Liu, Yun; Ji, Shujuan; Fu, Qiang; Zhao, Jianli; Zhao, Zhongying; Gong, Maoguo

doi:10.1007/s10489-021-03143-2

Published: 19 March 2022

Applied Intelligence, Volume 52, pages 16004–16020 (2022)

Yun Liu¹,
Shujuan Ji ORCID: orcid.org/0000-0003-2650-0161¹,
Qiang Fu²,
Jianli Zhao¹,
Zhongying Zhao¹ &
Maoguo Gong¹,³

Abstract

Hashing methods have been proposed for cross-modal retrieval tasks due to their flexibility and effectiveness. The main idea of cross-modal hashing is to embed heterogeneous multimedia data into a common Hamming space. How to effectively exploit modal semantic information and reduce optimization loss remains a challenging problem for existing cross-modal hashing methods. To address these issues, we propose a supervised cross-modal hashing method called Latent Semantic-Enhanced discrete Hashing (LSEH). LSEH first leverages matrix factorization to obtain individual latent semantic representations of the different modalities, and then applies correlation analysis and kernel discriminant analysis when projecting the latent semantic representations into the common Hamming space. Finally, the binary codes are generated directly with a discrete optimization strategy. Experimental results on four benchmark datasets demonstrate that LSEH outperforms state-of-the-art cross-modal hashing methods in terms of retrieval accuracy, especially on the image-to-text retrieval task, using shorter hash codes to associate images and texts.
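To make the idea of a common Hamming space concrete, the following is a minimal sketch of the retrieval step only, assuming binary codes for both modalities have already been learned. It is not the LSEH algorithm itself: the function names `hamming_distances` and `rank_by_hamming`, the 4-bit code length, and the toy codes are all illustrative assumptions.

```python
import numpy as np

def hamming_distances(query_code, db_codes):
    """Hamming distance between one {-1,+1} code and every row of db_codes."""
    r = db_codes.shape[1]  # code length in bits
    # For codes in {-1,+1}, each matching bit adds +1 and each differing bit
    # adds -1 to the inner product, so distance = (r - <q, b>) / 2.
    return (r - db_codes @ query_code) // 2

def rank_by_hamming(query_code, db_codes):
    """Indices of database items sorted from nearest to farthest."""
    return np.argsort(hamming_distances(query_code, db_codes), kind="stable")

# Hypothetical 4-bit codes: three text items and one image query that are
# assumed to live in the same Hamming space.
text_codes = np.array([[ 1,  1,  1,  1],
                       [ 1, -1,  1,  1],
                       [-1, -1, -1, -1]])
image_query = np.array([1, -1, 1, 1])

distances = hamming_distances(image_query, text_codes)
ranking = rank_by_hamming(image_query, text_codes)
print(distances, ranking)
```

Because both modalities share one code space, an image query can be compared against text codes with integer arithmetic alone, which is why shorter codes translate directly into cheaper retrieval.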

Acknowledgements

This paper is supported by the Natural Science Foundation of China (71772107, 62072288), the Natural Science Foundation of Shandong Province of China (ZR2020MF044, ZR202102230289, ZR2019MF003, ZR2021MF104), the Shandong Education Quality Improvement Plan for Postgraduate (2021), the SDUST Research Fund, and the Humanity and Social Science Fund of the Ministry of Education under Grants 20YJAZH078 and 20YJAZH127.

Author information

Authors and Affiliations

Key Laboratory for Wisdom Mine Information Technology of Shandong Province, Shandong University of Science and Technology, Qingdao, 266590, Shandong, China
Yun Liu, Shujuan Ji, Jianli Zhao, Zhongying Zhao & Maoguo Gong
Beijing Key Lab of Human-Computer Interaction, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China
Qiang Fu
Key Laboratory of Intelligent Perception and Image Understanding, Xidian University, PO Box 224, Xi’an, 710071, Shaanxi, China
Maoguo Gong

Corresponding authors

Correspondence to Shujuan Ji or Maoguo Gong.

About this article

Cite this article

Liu, Y., Ji, S., Fu, Q. et al. Latent semantic-enhanced discrete hashing for cross-modal retrieval. Appl Intell 52, 16004–16020 (2022). https://doi.org/10.1007/s10489-021-03143-2

Accepted: 23 December 2021
Published: 19 March 2022
Issue Date: November 2022
DOI: https://doi.org/10.1007/s10489-021-03143-2
