Abstract
In recent years, hashing methods have received extensive attention in multimedia search because of their high computational and storage efficiency. However, most of them learn a common representation of multi-modality data and generate hash codes from it, ignoring the specific properties of each modality. To mitigate this problem, we propose a novel hashing method, called Robust Supervised Matrix Factorization Hashing (RSMFH), which preserves both the shared and the modality-specific properties of multi-modality data by decomposing each modality into a common representation and an inconsistent representation. Moreover, we simultaneously impose sparse constraints on the inconsistent part of each modality and minimize the production of the consistent parts. In addition, the supervised label information is embedded into the learned hash codes, enhancing the discriminative ability of RSMFH. We employ an efficient discrete optimization strategy to solve the proposed model. Extensive experiments on four benchmark databases show that our approach achieves promising results in cross-modal retrieval tasks.
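
To make the efficiency argument concrete, the following is a minimal sketch of how binary hash codes are typically used at query time in cross-modal retrieval: items from one modality are ranked by Hamming distance to a query code from another modality. It does not reproduce RSMFH itself; the function name `hamming_rank`, the 8-bit code length, and the toy codes are assumptions made purely for illustration.

```python
import numpy as np

def hamming_rank(query_code, database_codes):
    """Rank database items by Hamming distance to a query code.

    Codes have entries in {-1, +1}. For r-bit codes the Hamming distance
    between q and b equals (r - <q, b>) / 2, so a single matrix-vector
    product yields all distances at once.
    """
    r = database_codes.shape[1]
    distances = (r - database_codes @ query_code) // 2
    order = np.argsort(distances, kind="stable")
    return order, distances

# Hypothetical 8-bit codes: one image query and four text items.
image_query = np.array([1, -1, 1, 1, -1, -1, 1, -1])
text_database = np.array([
    [1, -1, 1, 1, -1, -1, 1, -1],   # identical to the query
    [1, -1, 1, 1, -1, -1, -1, 1],   # differs in 2 bits
    [-1, 1, -1, -1, 1, 1, -1, 1],   # differs in all 8 bits
    [1, -1, 1, -1, -1, -1, 1, -1],  # differs in 1 bit
])

order, distances = hamming_rank(image_query, text_database)
print(order, distances)
```

Because both modalities are mapped into the same Hamming space, an image query can be compared against text codes directly, and each comparison only involves a few bits per item.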

Data availability
The datasets analyzed during the current study are available in the LabelMe, UCL, Pascal Sentences, and Wiki repositories: http://labelme.csail.mit.edu/Release3.0/, https://www.ucl.ac.uk/library, https://github.com/rupy/PascalSentenceDataset, and http://www.svcl.ucsd.edu/projects/crossmodal/.
Funding
This work was supported by the National Natural Science Foundation of China [Grant Nos. 61603159, 62162033, U21B2027, U1836218], the Yunnan Provincial Major Science and Technology Special Plan Projects [Grant Nos. 202002AD080001, 202103AA080015], the Yunnan Foundation Research Projects [Grant Nos. 202101AT070438, 202101BE070001-056], and the Excellent Key Teachers of QingLan Project in Jiangsu Province.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and experimental analysis were performed by ZS, KY and DZ. The first draft of the manuscript was written by KY and ZS. JY, ZY and XJW commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
All authors declare that they have no conflicts of interest to this work.
Ethical approval
Our study did not involve animals.
Informed consent
Our study did not involve human participants.
About this article
Cite this article
Shu, Z., Yong, K., Zhang, D. et al. Robust supervised matrix factorization hashing with application to cross-modal retrieval. Neural Comput & Applic 35, 6665–6684 (2023). https://doi.org/10.1007/s00521-022-08006-6
DOI: https://doi.org/10.1007/s00521-022-08006-6