Two-Probabilistic Latent Semantic Model for Image Annotation and Retrieval

  • Conference paper
Computer Vision – ACCV 2010 Workshops (ACCV 2010)

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 6468)

Abstract

A novel latent variable modeling technique for image annotation and retrieval is proposed. The model annotates images with relevant semantic meanings and retrieves images that satisfy a user's query, which may be given as text or as an image. A two-step latent variable framework is proposed so that a single system supports both retrieval and annotation. Existing image annotation models and the proposed model are compared in terms of annotation performance on images from standard databases, using precision-recall measurement, in order to identify the best model for automatic image annotation. Local features of each image in the database are extracted using the Scale-Invariant Feature Transform (SIFT) and quantized into visual words by clustering. Each image is then represented by a Bag-of-Features (BoF), which is a histogram of visual words. Semantic meanings are related to each BoF through a latent variable for annotation purposes. For image retrieval, each image query is likewise related to semantic meanings. Finally, retrieval results are obtained by matching the semantic meanings of the query with those of the images in the database using a second latent variable.
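
The pipeline described in the abstract (local descriptors, visual words, BoF histogram, latent variable) can be illustrated with a minimal sketch. It is not the authors' two-step model: it only shows a BoF histogram built by nearest-codeword assignment, followed by a single standard pLSA fit with EM. The function names `bag_of_features` and `plsa`, the random arrays standing in for SIFT descriptors, and the random codebook standing in for cluster centres are all assumptions made for illustration.

```python
import numpy as np


def bag_of_features(descriptors, codebook):
    """Quantize local descriptors to their nearest visual word and count them.

    descriptors: (n_desc, dim) array, e.g. SIFT vectors (assumed already extracted).
    codebook:    (n_words, dim) array of cluster centres (assumed already learned).
    """
    # Squared Euclidean distance from every descriptor to every codeword.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=codebook.shape[0])


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit standard pLSA with EM on a (n_docs, n_words) count matrix.

    Returns P(z|d) with shape (n_docs, n_topics)
    and P(w|z) with shape (n_topics, n_words).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z


# Tiny hand-checkable case: two descriptors fall near word 0, one near word 1.
toy_hist = bag_of_features(
    np.array([[1.0, 0.0], [9.0, 9.0], [0.0, 1.0]]),
    np.array([[0.0, 0.0], [10.0, 10.0]]),
)

# Synthetic stand-in for a small image collection (hypothetical data).
rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))
images = [rng.normal(size=(30, 4)) for _ in range(5)]
counts = np.stack([bag_of_features(d, codebook) for d in images])
p_z_d, p_w_z = plsa(counts, n_topics=2)
```

In this sketch each row of `p_z_d` is a per-image mixture over latent aspects. Under the assumptions above, a query histogram could be compared with database images in that aspect space rather than in raw visual-word space, which is the general idea behind matching "semantic meanings" in the abstract.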

Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Watcharapinchai, N., Aramvith, S., Siddhichai, S. (2011). Two-Probabilistic Latent Semantic Model for Image Annotation and Retrieval. In: Koch, R., Huang, F. (eds) Computer Vision – ACCV 2010 Workshops. ACCV 2010. Lecture Notes in Computer Science, vol 6468. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22822-3_36

  • DOI: https://doi.org/10.1007/978-3-642-22822-3_36

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22821-6

  • Online ISBN: 978-3-642-22822-3

  • eBook Packages: Computer Science, Computer Science (R0)
