Joint graph regularization based modality-dependent cross-media retrieval

Multimedia Tools and Applications

Abstract

Cross-media retrieval returns heterogeneous multimedia data that share the semantics of a query object, and its key problem is how to model the correlations among heterogeneous multimedia data. Many works focus on mapping data of different modalities into an isomorphic space so that the similarities between them can be measured. Inspired by this idea, we propose a joint graph regularization based modality-dependent cross-media retrieval approach (JGRMDCR), which takes into account the one-to-one correspondence between data pairs of different modalities, the inter-modality similarities and the intra-modality similarities. Meanwhile, according to the modality of the query object, this method learns different projection matrices for different retrieval tasks. Experimental results on benchmark datasets show that the proposed approach outperforms other state-of-the-art algorithms.
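As a rough illustration of the retrieval step the abstract describes, the sketch below projects a query and a gallery of the other modality into a shared space and ranks the gallery by cosine similarity, selecting a different pair of projection matrices depending on the modality of the query. It is not the JGRMDCR learning procedure: the matrices here are random placeholders standing in for learned ones, and the names `rank_by_cosine`, `retrieve`, `projections`, the task labels and the feature dimensions are all assumptions made for this example.

```python
import numpy as np

def rank_by_cosine(query_vec, gallery, proj_query, proj_gallery):
    """Project a query and a gallery into a shared space, then return
    gallery indices sorted from most to least similar (cosine)."""
    q = query_vec @ proj_query                 # shape (k,)
    g = gallery @ proj_gallery                 # shape (n, k)
    q = q / (np.linalg.norm(q) + 1e-12)        # unit-normalise the query
    g = g / (np.linalg.norm(g, axis=1, keepdims=True) + 1e-12)
    return np.argsort(-(g @ q))                # descending similarity

# Hypothetical setup: random matrices stand in for learned projections.
rng = np.random.default_rng(0)
d_img, d_txt, k = 6, 4, 3                      # assumed feature sizes
projections = {
    # one (query projection, gallery projection) pair per retrieval task
    "image_to_text": (rng.normal(size=(d_img, k)), rng.normal(size=(d_txt, k))),
    "text_to_image": (rng.normal(size=(d_txt, k)), rng.normal(size=(d_img, k))),
}
images = rng.normal(size=(5, d_img))           # toy image features
texts = rng.normal(size=(5, d_txt))            # toy text features

def retrieve(query_vec, task):
    """Pick the projection pair that matches the query modality."""
    proj_q, proj_g = projections[task]
    gallery = texts if task == "image_to_text" else images
    return rank_by_cosine(query_vec, gallery, proj_q, proj_g)

ranking = retrieve(images[0], "image_to_text")  # indices into `texts`
```

Keeping one projection pair per task mirrors the modality-dependent idea: an image query and a text query are each handled by matrices chosen for that direction of retrieval, while the ranking step itself stays the same.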

Acknowledgement

This work is partially supported by the National Natural Science Foundation of China (Nos. 61373081, 61572298, 61402268, 61401260, 61601268), the Key Research and Development Foundation of Shandong Province (No. 2016GGX101009) and the Natural Science Foundation of Shandong, China (Nos. BS2014DX006, ZR2014FM012). We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the TITAN X GPU used for this research.

Author information

Corresponding authors

Correspondence to Huaxiang Zhang or Jiande Sun.

About this article

Cite this article

Yan, J., Zhang, H., Sun, J. et al. Joint graph regularization based modality-dependent cross-media retrieval. Multimed Tools Appl 77, 3009–3027 (2018). https://doi.org/10.1007/s11042-017-4918-0

  • DOI: https://doi.org/10.1007/s11042-017-4918-0
