Semi-supervised semantic factorization hashing for fast cross-modal retrieval

Multimedia Tools and Applications

Abstract

Cross-modal hashing addresses large-scale cross-modal retrieval by combining the strengths of traditional cross-modal analysis and hashing techniques. Preserving semantic correlation in the hash codes is both important and challenging, and current hashing methods do not preserve it well: supervised hashing requires labeled data, which is difficult to obtain, while unsupervised hashing cannot effectively learn semantic correlation from multi-modal data. To learn semantic correlation effectively and improve hashing performance, we propose a novel approach, Semi-Supervised Semantic Factorization Hashing (S3FH), for large-scale cross-modal retrieval. The main idea of S3FH is to improve the semantic labels and factorize them into hash codes. It optimizes a joint framework consisting of three interacting parts: semantic factorization, multi-graph learning, and multi-modal correlation. An efficient alternating algorithm is derived to optimize S3FH. Extensive experiments on two real-world multi-modal datasets demonstrate the effectiveness of S3FH.
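
The abstract does not state the S3FH objective, so the following is only a minimal sketch of the general idea it describes, under stated assumptions, and not the authors' formulation. It assumes two paired modalities, fuses their kNN graphs with equal weights, spreads a few known labels over the fused graph with the standard closed-form label propagation, and then fits the propagated labels with binary codes by a heuristic alternating update. All function names, parameters (`k`, `alpha`, `n_bits`), and the toy data are hypothetical, and learning hash functions for out-of-sample queries is omitted.

```python
import numpy as np

def knn_affinity(X, k=5):
    """Symmetric k-nearest-neighbour affinity with Gaussian weights."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    W = np.exp(-d2 / (d2.mean() + 1e-12))
    np.fill_diagonal(W, 0.0)
    nearest = np.argsort(-W, axis=1)[:, :k]      # k strongest links per row
    mask = np.zeros(W.shape, dtype=bool)
    mask[np.arange(len(X))[:, None], nearest] = True
    return W * (mask | mask.T)                   # keep a link if either end chose it

def propagate_labels(W, Y0, alpha=0.9):
    """Closed-form label propagation: F = (I - alpha * S)^{-1} Y0."""
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.linalg.solve(np.eye(len(W)) - alpha * S, Y0)

def factorize_to_codes(F, n_bits, n_iter=10, seed=0):
    """Heuristic alternating fit of F ~ B @ V with B in {-1, +1}."""
    rng = np.random.default_rng(seed)
    Fc = F - F.mean(axis=0, keepdims=True)
    # Assumption: random-projection initialisation of the binary codes.
    B = np.where(Fc @ rng.standard_normal((F.shape[1], n_bits)) >= 0, 1.0, -1.0)
    for _ in range(n_iter):
        V = np.linalg.lstsq(B, Fc, rcond=None)[0]  # fix B, least-squares V
        B = np.where(Fc @ V.T >= 0, 1.0, -1.0)     # fix V, re-binarise (approximate)
    return B

# Toy paired data (hypothetical): 20 items, 2 classes, two modalities.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 10)
centers_img = np.array([[0.0, 0.0], [10.0, 10.0]])
centers_txt = np.array([[5.0, -5.0, 0.0], [-5.0, 5.0, 0.0]])
X_img = centers_img[labels] + 0.5 * rng.standard_normal((20, 2))
X_txt = centers_txt[labels] + 0.5 * rng.standard_normal((20, 3))

# Semi-supervised setting: only one labelled item per class.
Y0 = np.zeros((20, 2))
Y0[0, 0] = 1.0
Y0[10, 1] = 1.0

# Assumption: equal-weight fusion of the per-modality graphs.
W = 0.5 * knn_affinity(X_img) + 0.5 * knn_affinity(X_txt)
F = propagate_labels(W, Y0)          # "improved" soft labels for all items
B = factorize_to_codes(F, n_bits=8)  # one shared binary code per item
```

In this sketch the binary update is a sign heuristic rather than an exact minimizer, which keeps each iteration cheap; a real method would also need per-modality hash functions so that an unseen image or text query can be mapped into the same code space.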

Author information

Corresponding author

Correspondence to Jiale Wang.

About this article

Cite this article

Wang, J., Li, G., Pan, P. et al. Semi-supervised semantic factorization hashing for fast cross-modal retrieval. Multimed Tools Appl 76, 20197–20215 (2017). https://doi.org/10.1007/s11042-017-4567-3

  • DOI: https://doi.org/10.1007/s11042-017-4567-3
