Bayesian Framework for Automatic Image Annotation Using Visual Keywords

Agrawal, Rajeev; Wu, Changhua; Grosky, William; Fotouhi, Farshad

doi:10.1007/978-3-642-13467-8_14

Part of the book series: Communications in Computer and Information Science (CCIS, volume 75)

Included in the following conference series:

International Conference on Ubiquitous Computing and Multimedia Applications

Abstract

In this paper, we propose a Bayesian probability-based framework that uses visual keywords and already available text keywords to annotate images automatically. Taking a cue from document classification, we treat an image as a document and the objects present in it as words. Using this concept, we create visual keywords by dividing an image into tiles of a chosen template size. Visual keywords are obtained by simple vector quantization of these small image tiles. We estimate the conditional probability of a text keyword in the presence of visual keywords, described by a multivariate Gaussian distribution. We demonstrate the effectiveness of our approach by comparing predicted text annotations with manual annotations, and we analyze the effect of text annotation length on performance.
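The abstract gives no implementation details, so the following is only a minimal sketch of the general idea under stated assumptions: grayscale images, raw pixel values as tile features, a plain k-means as the vector quantizer, one multivariate Gaussian per text keyword fitted over visual-keyword histograms, and text keywords treated as mutually exclusive when the posterior is normalized. All function names and parameters are hypothetical, and none of these choices is confirmed as the authors' method.

```python
import numpy as np

def tile_features(image, tile=4):
    """Split an H x W grayscale image into non-overlapping tile x tile
    patches and return one flattened row per patch."""
    h, w = image.shape
    h, w = h - h % tile, w - w % tile          # crop to a multiple of tile
    patches = image[:h, :w].reshape(h // tile, tile, w // tile, tile)
    return patches.transpose(0, 2, 1, 3).reshape(-1, tile * tile)

def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means; the k centers play the role of visual keywords
    (assumed quantizer, not taken from the paper)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):            # leave empty clusters unchanged
                centers[j] = x[labels == j].mean(axis=0)
    return centers

def histogram(image, centers, tile=4):
    """Normalized count of how often each visual keyword occurs in an image."""
    x = tile_features(image, tile)
    dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    counts = np.bincount(dist.argmin(axis=1), minlength=len(centers))
    return counts / counts.sum()

def fit_gaussian(hists, reg=1e-3):
    """Mean and regularized covariance of the histograms of all training
    images carrying one text keyword. Histograms sum to 1, so the raw
    covariance is singular; reg * I keeps it invertible."""
    mu = hists.mean(axis=0)
    cov = np.cov(hists, rowvar=False) + reg * np.eye(hists.shape[1])
    return mu, cov

def log_gaussian(x, mu, cov):
    """Log density of a multivariate Gaussian at x."""
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (maha + logdet + len(x) * np.log(2 * np.pi))

def keyword_posteriors(x, models, priors):
    """Bayes' rule: P(word | x) is proportional to p(x | word) * P(word).
    Normalizing over the words assumes they are mutually exclusive,
    which is a simplification."""
    words = list(models)
    logp = np.array([log_gaussian(x, *models[w]) + np.log(priors[w])
                     for w in words])
    p = np.exp(logp - logp.max())              # subtract max for stability
    return dict(zip(words, p / p.sum()))
```

In use, one would build `centers` from the tiles of all training images, call `fit_gaussian` once per text keyword on the histograms of the images annotated with it, and then rank keywords for a new image by `keyword_posteriors(histogram(image, centers), models, priors)`. The regularization constant and tile size are free parameters that would need tuning.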

Author information

Authors and Affiliations

Grand Valley State University, 1 Campus Drive, Allendale, MI, 49401
Rajeev Agrawal
Kettering University, 1700 West Third Av, Flint, MI, 48504
Changhua Wu
The University of Michigan, 4901 Evergreen Road, Dearborn, MI, 48128
William Grosky
Wayne State University, 431 State Hall, Detroit, MI, 48202
Farshad Fotouhi

Editor information

Editors and Affiliations

VITM, Indore, India
G. S. Tomar
Department of Computer and Information Science, University of Michigan – Dearborn, 4901 Evergreen Road, 48128, Dearborn, MI, USA
William I. Grosky
Hannam University, 306-791, Daejeon, South Korea
Tai-hoon Kim
Department of Computer Science, Lakehead University, P7B 5E1, Thunder Bay, Ontario, Canada
Sabah Mohammed
Computer Science and Engineering Department, Jadavpur University, Kolkata, India
Sanjoy Kumar Saha

Cite this paper

Agrawal, R., Wu, C., Grosky, W., Fotouhi, F. (2010). Bayesian Framework for Automatic Image Annotation Using Visual Keywords. In: Tomar, G.S., Grosky, W.I., Kim, Th., Mohammed, S., Saha, S.K. (eds) Ubiquitous Computing and Multimedia Applications. UCMA 2010. Communications in Computer and Information Science, vol 75. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13467-8_14

DOI: https://doi.org/10.1007/978-3-642-13467-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13466-1
Online ISBN: 978-3-642-13467-8
eBook Packages: Computer Science, Computer Science (R0)
