
Coupled feature selection based semi-supervised modality-dependent cross-modal retrieval

Published in: Multimedia Tools and Applications

Abstract

With the explosive growth of multimedia data, information is increasingly represented in multiple modalities. Cross-modal applications have therefore attracted growing attention in recent years, and cross-modal retrieval is among the most popular of them. In this paper, we propose a semi-supervised modality-dependent cross-modal retrieval method based on coupled feature selection (Semi-CoFe). It differs from most previous cross-modal retrieval methods, which typically use only labeled data to learn the projection matrices under an l2-norm constraint. Specifically, we propagate the labels of cluster centers to unlabeled data via a devised weight matrix and construct pseudo corresponding heterogeneous data. We then jointly consider semantic regression and pair-wise correlation analysis when learning the mapping matrices, so as to preserve both semantic consistency and the closeness of paired data. Meanwhile, an l2,1-norm constraint is imposed to select informative, discriminative features and to reduce noise. In addition, we learn different mapping matrices for different sub-tasks (i.e., using an image to search for text (I2T) and using text to search for an image (T2I)) to distinguish the semantic information of the query data, and the optimal mapping matrices are obtained via an iterative optimization method. Experimental results on three public datasets verify that the proposed method outperforms state-of-the-art methods.
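The abstract's combination of a semantic-regression term with an l2,1-norm regularizer can be sketched numerically. This is a minimal illustration of the general technique, not the paper's actual formulation: the function names, matrix shapes, and trade-off weight `lam` are all hypothetical, and the pair-wise correlation and label-propagation terms are omitted for brevity.

```python
import numpy as np

def l21_norm(W):
    """l2,1 norm: the sum of the l2 norms of the rows of W.
    Penalizing it drives entire rows toward zero, so features
    (rows) with near-zero norm are effectively discarded --
    this is why the penalty performs feature selection."""
    return np.sum(np.sqrt(np.sum(W ** 2, axis=1)))

def objective(W, X, Y, lam=0.1):
    """Toy single-modality objective (hypothetical):
    X is an n x d feature matrix, Y an n x c label matrix,
    W a d x c projection. The first term is the semantic
    regression loss; the second is the sparsity penalty."""
    semantic = np.linalg.norm(X @ W - Y, 'fro') ** 2
    return semantic + lam * l21_norm(W)
```

Because the l2,1 norm is non-smooth, objectives of this form are usually minimized with an iteratively reweighted scheme (alternating between a closed-form update of W and a diagonal reweighting matrix), which matches the abstract's mention of an iterative optimization method.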



Acknowledgements

This work is supported by the Natural Science Foundation for Distinguished Young Scholars of Shandong Province (JQ201718), the Key Research and Development Foundation of Shandong Province (2016GGX101009), the National Natural Science Foundation of China (U1736122, 61603225, 61601268), and the Shandong Provincial Key Research and Development Plan (2017CXGC1504). We also gratefully acknowledge the support of NVIDIA Corporation with the donation of the TITAN X GPU used for this research.

Author information


Correspondence to Jiande Sun or Wenbo Wan.


Cite this article

Yu, E., Sun, J., Wang, L. et al. Coupled feature selection based semi-supervised modality-dependent cross-modal retrieval. Multimed Tools Appl 78, 28931–28951 (2019). https://doi.org/10.1007/s11042-018-5958-9
