ABSTRACT
Online social media repositories such as Flickr and Zooomr allow users to manually annotate their images with freely chosen tags, which are then used as indexing keywords to facilitate image search and other applications. However, although these tags are provided by human beings, they are frequently imprecise and incomplete, and many of them are meaningful only to the image owner (such as the name of a dog). Thus there is still a gap between these tags and the actual content of the images, and this gap significantly limits tag-based applications such as search and browsing. To tackle this issue, this paper proposes a social image "retagging" scheme that aims to assign images better content descriptors. The refining process, which includes denoising and enriching, is formulated as an optimization framework based on the consistency between "visual similarity" and "semantic similarity" in social images; that is, visually similar images tend to have similar semantic descriptors, and vice versa. An effective iterative bound optimization algorithm is applied to learn the improved tag assignment. In addition, as many tags are intrinsically not closely related to the visual content of the images, we employ a knowledge-based method to differentiate tags related to visual content from unrelated ones, and then constrain the tagging vocabulary of our automatic algorithm to the content-related tags. Finally, to improve the coverage of the tags, we further enrich the tag set with appropriate synonyms and hypernyms drawn from an external knowledge base. Experimental results on a Flickr image collection demonstrate the effectiveness of this approach. We also show the remarkable performance improvements brought by retagging in two applications, namely tag-based search and automatic annotation.
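The consistency idea in the abstract can be illustrated with a small numerical sketch. This is not the paper's formulation or its iterative bound optimization algorithm: the objective below, the names `W`, `S`, `Y0`, `c`, and `retag`, and the toy data are all assumptions made for illustration, and plain projected gradient descent with backtracking is swapped in as the solver. The sketch asks that the tag-based similarity between images, `Y @ S @ Y.T`, stay close to their visual similarity `W`, while the refined tags `Y` stay close to the user-provided tags `Y0`.

```python
import numpy as np

def objective(Y, W, S, Y0, c):
    """Consistency term plus fidelity to the original tags (assumed form)."""
    R = Y @ S @ Y.T - W
    return float(np.sum(R ** 2) + c * np.sum((Y - Y0) ** 2))

def retag(W, S, Y0, c=1.0, iters=200, step=1e-2):
    """Projected gradient descent with backtracking; keeps Y non-negative."""
    Y = Y0.astype(float).copy()
    f = objective(Y, W, S, Y0, c)
    for _ in range(iters):
        R = Y @ S @ Y.T - W
        # Gradient of the objective, valid when W and S are symmetric.
        grad = 4.0 * R @ Y @ S + 2.0 * c * (Y - Y0)
        t = step
        while t > 1e-12:
            Y_new = np.maximum(Y - t * grad, 0.0)  # project onto Y >= 0
            f_new = objective(Y_new, W, S, Y0, c)
            if f_new < f:
                Y, f = Y_new, f_new
                break
            t *= 0.5  # shrink the step until the objective decreases
        else:
            break  # no improving step found; stop early
    return Y, f

# Hypothetical toy data: 4 images, tags = ["dog", "puppy", "sunset"].
# Images 0 and 1 look alike, as do images 2 and 3.
W = np.array([[1.0, 0.9, 0.1, 0.1],
              [0.9, 1.0, 0.1, 0.1],
              [0.1, 0.1, 1.0, 0.9],
              [0.1, 0.1, 0.9, 1.0]])
# Tag-to-tag semantic similarity: "dog" and "puppy" are close.
S = np.array([[1.0, 0.8, 0.0],
              [0.8, 1.0, 0.0],
              [0.0, 0.0, 1.0]])
# User tags: image 1 carries a noisy "sunset" tag, image 3 has no tags.
Y0 = np.array([[1, 0, 0],
               [0, 1, 1],
               [0, 0, 1],
               [0, 0, 0]])

f_init = objective(Y0.astype(float), W, S, Y0, 1.0)
Y, f_final = retag(W, S, Y0)
print(np.round(Y, 2))
print(f_init, f_final)
```

Each row of the printed matrix holds the refined tag scores for one image. The weight `c` trades off trust in the original tags against agreement with visual similarity; a larger `c` keeps `Y` closer to `Y0`.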
Index Terms
- Image retagging