Abstract
A novel latent variable modeling technique for image annotation and retrieval is proposed. The model annotates images with relevant semantic meanings and retrieves images that satisfy a user's query, which may be given as text or as an image. A two-step latent variable framework is proposed so that a single system supports both annotation and retrieval. The proposed model is also compared with existing image annotation models in terms of annotation performance, using images from standard databases and precision-recall measurement to identify the best model for automatic image annotation. Local features are extracted from each image in the database using the Scale-Invariant Feature Transform (SIFT) and quantized into visual words by clustering. Each image is then represented by a Bag-of-Features (BoF), a histogram of visual words. For annotation, semantic meanings are related to each BoF through a latent variable. For retrieval, each image query is likewise related to semantic meanings, and results are obtained by matching the semantic meanings of the query with those of the database images through a second latent variable.
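The Bag-of-Features step described in the abstract can be illustrated with a minimal sketch. It assumes the local descriptors (for example SIFT vectors) and the codebook of visual words (for example k-means centroids) have already been computed; the function name `bag_of_features` and the toy 2-D data are hypothetical and are not taken from the paper.

```python
import numpy as np


def bag_of_features(descriptors, codebook):
    """Return a normalized histogram of visual words for one image.

    descriptors: (n, d) array of local descriptors (assumed precomputed).
    codebook:    (k, d) array of visual words (assumed cluster centroids).
    """
    # Squared Euclidean distance from every descriptor to every visual word.
    diffs = descriptors[:, None, :] - codebook[None, :, :]
    dist2 = (diffs ** 2).sum(axis=2)

    # Assign each descriptor to its nearest visual word.
    words = dist2.argmin(axis=1)

    # Count occurrences of each visual word, then normalize to sum to 1.
    hist = np.bincount(words, minlength=codebook.shape[0]).astype(float)
    total = hist.sum()
    return hist / total if total > 0 else hist


# Toy example: 3 visual words and 4 descriptors in 2-D (real SIFT is 128-D).
codebook = np.array([[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]])
descriptors = np.array([[0.1, -0.2], [9.5, 10.2], [0.3, 0.1], [0.2, 9.9]])
hist = bag_of_features(descriptors, codebook)
print(hist)
```

The resulting histogram is the kind of per-image representation that a latent variable model could then relate to semantic meanings; how the paper does so is not specified in the abstract, so that step is not sketched here.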
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Watcharapinchai, N., Aramvith, S., Siddhichai, S. (2011). Two-Probabilistic Latent Semantic Model for Image Annotation and Retrieval. In: Koch, R., Huang, F. (eds) Computer Vision – ACCV 2010 Workshops. ACCV 2010. Lecture Notes in Computer Science, vol 6468. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22822-3_36
DOI: https://doi.org/10.1007/978-3-642-22822-3_36
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22821-6
Online ISBN: 978-3-642-22822-3
eBook Packages: Computer Science, Computer Science (R0)