Abstract
In image annotation, annotation words are expected to represent image content at both the visual and the semantic level. A single word, however, is sometimes ambiguous: “apple”, for example, may refer to a fruit or to a company. When “apple” is combined with “phone” or “fruit”, the resulting phrase is more semantically and visually consistent. In this paper, we attempt to find such combinations and construct a less ambiguous phrase-level lexicon for annotation. First, concept-based image search is conducted to obtain a semantically consistent image set (SC-IS). Then, a hierarchical clustering algorithm is adopted to visually cluster the images in SC-IS to obtain a semantically and visually specific image set (SVC-IS). Finally, we apply frequent itemset mining to SVC-IS to construct the phrase-level lexicon, and we integrate the lexicon into a probabilistic annotation framework to estimate the annotation words of any untagged image. Our experimental results show that the discovered phrase-level lexicon improves annotation performance.
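The frequent itemset mining step can be illustrated with a small sketch. The abstract does not specify the mining algorithm, the support threshold, or the data layout, so everything below is an assumption made for illustration: the function name `mine_frequent_phrases`, the `min_support` and `max_len` parameters, and the toy tag sets are hypothetical, and the brute-force enumeration stands in for whatever miner (for example Apriori or FP-growth) the paper actually uses. Each image in one visual cluster is treated as a transaction of tags, and tag combinations that co-occur often enough are kept as candidate phrases.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_phrases(tag_sets, min_support=0.5, max_len=2):
    """Return {sorted tag tuple: support} for tag combinations of up to
    max_len tags that occur in at least min_support of the images.

    Brute-force enumeration, adequate for small clusters; this is an
    illustrative stand-in, not the paper's stated algorithm.
    """
    n = len(tag_sets)
    if n == 0:
        return {}
    counts = Counter()
    for tags in tag_sets:
        unique = sorted(set(tags))  # canonical order so tuples compare equal
        for k in range(1, max_len + 1):
            counts.update(combinations(unique, k))
    return {items: c / n for items, c in counts.items() if c / n >= min_support}


# Hypothetical tags of four images that fell into one visual cluster.
cluster_tags = [
    {"apple", "fruit", "red"},
    {"apple", "fruit", "tree"},
    {"apple", "fruit"},
    {"apple", "phone"},
]

phrases = mine_frequent_phrases(cluster_tags, min_support=0.6)
for items, support in sorted(phrases.items()):
    print(" ".join(items), support)
```

In this toy cluster the pair ("apple", "fruit") reaches a support of 0.75, while "phone" appears in only one image and is dropped, which mirrors the intuition that "apple fruit" is a less ambiguous annotation unit than "apple" alone.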
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Yu, L., Liu, J., Xu, C. (2010). Discovering Phrase-Level Lexicon for Image Annotation. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15702-8_17
DOI: https://doi.org/10.1007/978-3-642-15702-8_17
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15701-1
Online ISBN: 978-3-642-15702-8
eBook Packages: Computer Science (R0)