A two-level model for automatic image annotation

Ke, Xiao; Li, Shaozi; Cao, Donglin

doi:10.1007/s11042-010-0706-9

Published: 11 January 2011

Volume 61, pages 195–212, (2012)
Multimedia Tools and Applications

Xiao Ke¹, Shaozi Li¹ & Donglin Cao¹

Abstract

Automatic image annotation is a significant and challenging problem in pattern recognition and computer vision. Most current image annotation models use all of the training images to estimate the joint generation probabilities between images and keywords, which inevitably brings in many irrelevant images. To solve this problem, we propose a hierarchical image annotation model that combines the advantages of discriminative and generative models. In the first annotation layer, a discriminative model assigns topic annotations to unlabeled images, and the relevant image set corresponding to each unlabeled image is then obtained. In the second annotation layer, we propose a keyword-oriented method to establish links between images and keywords, and our iterative algorithm is then used to expand the relevant image sets. Our method, which is based on visual keywords, gives candidate labels higher weights. Finally, a generative model assigns detailed annotations to unlabeled images using the expanded relevant image sets. Experiments conducted on the Corel 5K dataset verify the effectiveness of our hierarchical image annotation model.
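
The two-level flow described in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract does not specify the classifier, the expansion rule, or the generative model, so every concrete choice here is an assumption made for illustration. A nearest-centroid rule stands in for the discriminative first level, keyword overlap stands in for the keyword-oriented iterative expansion, and a Gaussian-kernel weighted vote stands in for the generative second level. The toy data, the function names (`assign_topic`, `expand_relevant_set`, `score_keywords`), and the `sigma` parameter are all hypothetical.

```python
import math

# Toy training set (hypothetical): (feature vector, topic label, keyword set).
TRAIN = [
    ([0.9, 0.1], "nature", {"sky", "tree"}),
    ([0.8, 0.2], "nature", {"tree", "grass"}),
    ([0.1, 0.9], "city", {"building", "street"}),
    ([0.2, 0.8], "city", {"street", "car"}),
    ([0.5, 0.5], "park", {"grass", "bench"}),
]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_topic(query, train):
    """Level 1 stand-in: pick the topic whose feature centroid is nearest."""
    groups = {}
    for feat, topic, _ in train:
        groups.setdefault(topic, []).append(feat)
    centroids = {
        topic: [sum(col) / len(feats) for col in zip(*feats)]
        for topic, feats in groups.items()
    }
    return min(centroids, key=lambda t: sq_dist(query, centroids[t]))


def expand_relevant_set(seed, train, max_rounds=5):
    """Expansion stand-in: repeatedly add images sharing a keyword with the set."""
    relevant = list(seed)
    for _ in range(max_rounds):
        vocab = set()
        for i in relevant:
            vocab |= train[i][2]
        added = [i for i in range(len(train))
                 if i not in relevant and train[i][2] & vocab]
        if not added:
            break  # the set has stopped growing
        relevant.extend(added)
    return relevant


def score_keywords(query, relevant, train, sigma=0.5):
    """Level 2 stand-in: kernel-weighted keyword vote over the relevant set only."""
    scores = {}
    for i in relevant:
        feat, _, keywords = train[i]
        weight = math.exp(-sq_dist(query, feat) / (2 * sigma ** 2))
        for word in keywords:
            # Each image spreads its weight uniformly over its own keywords.
            scores[word] = scores.get(word, 0.0) + weight / len(keywords)
    return sorted(scores, key=scores.get, reverse=True)


query = [0.85, 0.15]  # features of an unlabeled image (hypothetical)
topic = assign_topic(query, TRAIN)
seed = [i for i, (_, t, _) in enumerate(TRAIN) if t == topic]
relevant = expand_relevant_set(seed, TRAIN)
annotations = score_keywords(query, relevant, TRAIN)[:2]
print(topic, relevant, annotations)
```

The point of the sketch is the restriction in the last step: keyword scores are accumulated only over the expanded relevant set, not over every training image, which is the mechanism the abstract credits with filtering out irrelevant images.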

Notes

http://www.flickr.com

Acknowledgment

This work is partially supported by the National Natural Science Foundation of China (No. 60873179, No. 60803078), the Research Fund for the Doctoral Program of Higher Education of China (No. 20090121110032), and the Shenzhen Municipal Science and Technology Planning Program for Basic Research of China (No. JC200903180630A). The authors also gratefully acknowledge the helpful comments and suggestions of the reviewers, which have improved the paper.

Author information

Authors and Affiliations

Department of Cognitive Science, Fujian Key Laboratory of the Brain-like Intelligent Systems, Xiamen University, Xiamen, China
Xiao Ke, Shaozi Li & Donglin Cao

Corresponding author

Correspondence to Xiao Ke.

About this article

Cite this article

Ke, X., Li, S. & Cao, D. A two-level model for automatic image annotation. Multimed Tools Appl 61, 195–212 (2012). https://doi.org/10.1007/s11042-010-0706-9

Issue Date: November 2012
DOI: https://doi.org/10.1007/s11042-010-0706-9
