Image Tag Recommendation via Deep Cross-Modal Correlation Mining

Zhang, Xingmeng; Jin, Cheng; Zhang, Yuejie; Zhang, Tao

doi:10.1007/978-3-319-47674-2_36

Conference paper
First Online: 10 October 2016

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10035)

Abstract

This paper develops a novel image tag recommendation framework that fuses deep multimodal feature representation with cross-modal correlation mining, so that the most appropriate and relevant tags can be presented for an image, which in turn facilitates more accurate image retrieval. Tag recommendation of this kind can be modeled as an inter-related correlation distribution over deep multimodal visual and semantic representations of images and tags; the central problem is to build a more effective cross-modal correlation and to measure the degree to which images and tags are related. Experiments on a large amount of public data yielded very positive results.
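
To make the idea of measuring how strongly an image and a tag are related more concrete, the following is a minimal sketch and not the authors' method. It assumes that image and tag representations have already been projected into a common space (the abstract does not say how), and it uses cosine similarity as a stand-in relatedness score. The function names `cosine` and `recommend_tags` and the toy vectors are hypothetical, chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend_tags(image_vec, tag_vecs, top_k=2):
    """Return the top_k tag names, ranked by similarity to the image vector."""
    scored = [(cosine(image_vec, vec), tag) for tag, vec in tag_vecs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [tag for _, tag in scored[:top_k]]


# Hand-made toy vectors (assumption: both modalities already live in one space).
image_vec = [0.9, 0.1, 0.0]
tag_vecs = {
    "beach": [1.0, 0.0, 0.0],
    "sea": [0.8, 0.3, 0.0],
    "car": [0.0, 0.0, 1.0],
}

print(recommend_tags(image_vec, tag_vecs))  # prints ['beach', 'sea']
```

In a real system the shared space would come from a learned cross-modal model rather than hand-written vectors; the ranking step itself would look much the same.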

Acknowledgments

This work is supported by the National Key Research and Development Plan (Grant No. 2016YFC0801003). Yuejie Zhang is the corresponding author.

Author information

Authors and Affiliations

School of Computer Science, Shanghai Key Laboratory of Intelligent Information Processing, Fudan University, Shanghai, 200433, People’s Republic of China
Xingmeng Zhang, Cheng Jin & Yuejie Zhang
School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai, 200433, People’s Republic of China
Tao Zhang

Corresponding author

Correspondence to Yuejie Zhang.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Maosong Sun
Fudan University, Shanghai, China
Xuanjing Huang
Dalian University of Technology, Dalian, China
Hongfei Lin
Tsinghua University, Beijing, China
Zhiyuan Liu
Tsinghua University, Beijing, China
Yang Liu

About this paper

Cite this paper

Zhang, X., Jin, C., Zhang, Y., Zhang, T. (2016). Image Tag Recommendation via Deep Cross-Modal Correlation Mining. In: Sun, M., Huang, X., Lin, H., Liu, Z., Liu, Y. (eds) Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data. NLP-NABD 2016, CCL 2016. Lecture Notes in Computer Science (LNAI), vol 10035. Springer, Cham. https://doi.org/10.1007/978-3-319-47674-2_36

DOI: https://doi.org/10.1007/978-3-319-47674-2_36
Published: 10 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47673-5
Online ISBN: 978-3-319-47674-2
eBook Packages: Computer Science, Computer Science (R0)
