Fast unsupervised consistent and modality-specific hashing for multimedia retrieval

Yang, Zhan; Deng, Xiyin; Long, Jun

doi:10.1007/s00521-022-08008-4

Original Article
Published: 17 November 2022

Volume 35, pages 6207–6223, (2023)

Neural Computing and Applications

Zhan Yang^1,2,3,
Xiyin Deng^1,2 &
Jun Long^1,2,3

Abstract

Hashing is an effective technique for storing large-scale data and retrieving it efficiently, and it is also a core technology for promoting the intelligent development of new infrastructure construction. In most practical situations, label information is unavailable, and creating manual annotations is time-consuming and laborious. Unsupervised cross-modal hashing has therefore received extensive attention from the information retrieval community because of its fast retrieval speed and feasibility. However, existing unsupervised cross-modal hashing methods cannot comprehensively describe the complex relations among different modalities, such as the balance between complementarity and consistency. In this article, we propose a new unsupervised cross-modal hashing method called Fast Unsupervised Consistent and Modality-Specific Hashing (FUCMSH). FUCMSH consists of two main modules: a shared matrix factorization module (SMFM) and an individual auto-encoding module (IAEM). In the SMFM, FUCMSH dynamically assigns a weight to each modality to adaptively balance the modalities' contributions, which guarantees the information completeness of the shared consistent representation. In the IAEM, FUCMSH learns an individual modality-specific latent representation for each modality through modality-specific linear autoencoders. Moreover, FUCMSH uses transfer learning to link the different modality-specific latent representations. Combining the SMFM and the IAEM significantly improves the discriminative capability of the generated binary codes. Relatively extensive experimental results demonstrate the superiority of the proposed FUCMSH.
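The idea of a shared factorization with adaptive modality weights can be illustrated with a small NumPy sketch. This is not the FUCMSH objective, which the abstract does not spell out; it is a generic weighted collective matrix factorization, assumed here for illustration only. Each modality matrix X_m is approximated as U_m V with one shared V, the weights alpha_m are constrained to sum to one and enter the loss as alpha_m^gamma, and binary codes are taken as sign(V). The function name, the hyperparameters `gamma` and `lam`, and the update rules are all assumptions; the IAEM and the transfer-learning link are not covered.

```python
import numpy as np

def weighted_shared_factorization(views, n_bits=16, gamma=2.0, lam=1e-2,
                                  n_iter=20, seed=0):
    """Illustrative sketch (not the paper's algorithm).

    views: list of arrays, each of shape (d_m, n), one per modality.
    Minimizes sum_m alpha_m**gamma * ||X_m - U_m V||_F^2 plus ridge terms
    by alternating closed-form updates, with sum_m alpha_m = 1, gamma > 1.
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[1]
    V = rng.standard_normal((n_bits, n))            # shared representation
    alphas = np.full(len(views), 1.0 / len(views))  # start with equal weights
    eye = np.eye(n_bits)

    for _ in range(n_iter):
        # Per-modality basis: ridge regression of X_m onto the shared V.
        gram_inv = np.linalg.inv(V @ V.T + lam * eye)
        bases = [X @ V.T @ gram_inv for X in views]

        # Shared representation: weighted least squares over all modalities.
        w = alphas ** gamma
        lhs = sum(wm * U.T @ U for wm, U in zip(w, bases)) + lam * eye
        rhs = sum(wm * U.T @ X for wm, U, X in zip(w, bases, views))
        V = np.linalg.solve(lhs, rhs)

        # Modality weights: a smaller reconstruction error gives a larger weight.
        errs = np.array([np.linalg.norm(X - U @ V) ** 2
                         for X, U in zip(views, bases)])
        inv = (1.0 / np.maximum(errs, 1e-12)) ** (1.0 / (gamma - 1.0))
        alphas = inv / inv.sum()

    # Binarize the shared representation into {-1, +1} hash codes.
    codes = np.where(V >= 0, 1, -1).astype(np.int8)
    return codes, alphas, bases
```

Under these assumptions, calling the function with an image feature matrix and a text feature matrix that share the same number of columns returns one `n_bits`-long code per sample together with the learned weight of each modality.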

Data Availability Statement

This publication is supported by multiple datasets, which are openly available at the hyperlinks in the dataset section or at the locations cited in the reference section.

Acknowledgements

This work was supported in part by the National Key R&D Program of China under Grant 2021YFB3900902, in part by the National Natural Science Foundation of China under Grants 62202501 and U2003208, and in part by the Science and Technology Plan of Hunan Province under Grants 2022JJ40638 and 2016TP1003.

Author information

Authors and Affiliations

School of Computer Science and Engineering, Central South University, Changsha, 410083, China
Zhan Yang, Xiyin Deng & Jun Long
Network Resources Management and Trust Evaluation Key Laboratory of Hunan Province, Changsha, 410083, China
Zhan Yang, Xiyin Deng & Jun Long
Big Data Institute, Central South University, Changsha, 410083, China
Zhan Yang & Jun Long

Corresponding author

Correspondence to Jun Long.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yang, Z., Deng, X. & Long, J. Fast unsupervised consistent and modality-specific hashing for multimedia retrieval. Neural Comput & Applic 35, 6207–6223 (2023). https://doi.org/10.1007/s00521-022-08008-4

Received: 08 January 2022
Accepted: 26 October 2022
Published: 17 November 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s00521-022-08008-4
