Improving image tags by exploiting web search results

Zhang, Xiaoming; Li, Zhoujun; Chao, Wenhan

doi:10.1007/s11042-011-0863-5

Published: 27 August 2011

Volume 62, pages 601–631, (2013)

Multimedia Tools and Applications

Xiaoming Zhang^1,2,3,
Zhoujun Li^1,2,3 &
Wenhan Chao^1,2,3


Abstract

Automatic image tagging assigns semantic keywords, called tags, to an image, which significantly facilitates image search and organization. Most existing image tagging approaches are constrained by the model learned from the training dataset, and they do not exploit other types of web resources (e.g., web text documents). In this paper, we propose a search-based image tagging algorithm (CTSTag), in which the result tags are derived from web search results. Specifically, it assigns the query image a more comprehensive tag set derived from both web images and web text documents. First, content-based image search is used to retrieve a set of visually similar images, which are ranked by their semantic consistency values. A set of relevant tags is then derived from the top-ranked images as the initial tag set. Second, a text-based search retrieves other relevant web resources by using the initial tag set as the query. After a denoising process, the initial tag set is expanded with other tags mined from the text-based search results. Then, a probability flow measure is proposed to estimate the probabilities of the expanded tags. Finally, all the tags are refined using the Random Walk with Restart (RWR) method, and the top-ranked ones are assigned to the query image. Experiments on the NUS-WIDE dataset demonstrate not only the performance of the proposed algorithm but also the advantage of image retrieval and organization based on the result tags.
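
The final refinement step in the abstract relies on Random Walk with Restart. As a rough illustration only, and not the authors' implementation, the sketch below shows how RWR could rescore a small candidate tag set over a tag-similarity graph. The function name, the tag names, the similarity weights, the initial scores, and the `restart_prob` value are all assumptions made for this example; the abstract does not specify any of them.

```python
def random_walk_with_restart(similarity, initial_scores,
                             restart_prob=0.15, tol=1e-9, max_iter=1000):
    """Rescore tags by a random walk with restart on a tag-similarity graph.

    similarity: dict tag -> dict tag -> non-negative weight (hypothetical input).
    initial_scores: dict tag -> non-negative score; normalized here and used
        as the restart distribution.
    """
    tags = sorted(initial_scores)
    total = sum(initial_scores.values())
    if total <= 0:
        raise ValueError("initial scores must have a positive sum")
    prior = {t: initial_scores[t] / total for t in tags}

    # Total edge weight leaving each tag, used to normalize transitions.
    out_weight = {
        t: sum(similarity.get(t, {}).get(u, 0.0) for u in tags) for t in tags
    }

    scores = dict(prior)
    for _ in range(max_iter):
        new_scores = {}
        for u in tags:
            walked = 0.0
            for t in tags:
                if out_weight[t] > 0:
                    weight = similarity.get(t, {}).get(u, 0.0)
                    walked += scores[t] * weight / out_weight[t]
                else:
                    # A tag with no outgoing edges hands its mass to the prior.
                    walked += scores[t] * prior[u]
            # Follow an edge with probability 1 - restart_prob, else restart.
            new_scores[u] = (1 - restart_prob) * walked + restart_prob * prior[u]
        delta = sum(abs(new_scores[t] - scores[t]) for t in tags)
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy data: three mutually related tags and one weakly related tag.
similarity = {
    "beach": {"sea": 0.9, "sand": 0.8, "car": 0.1},
    "sea": {"beach": 0.9, "sand": 0.7, "car": 0.1},
    "sand": {"beach": 0.8, "sea": 0.7, "car": 0.05},
    "car": {"beach": 0.1, "sea": 0.1, "sand": 0.05},
}
initial_scores = {"beach": 0.4, "sea": 0.3, "sand": 0.2, "car": 0.1}

refined = random_walk_with_restart(similarity, initial_scores)
ranked = sorted(refined, key=refined.get, reverse=True)
print(ranked)
```

In this toy graph the weakly connected tag ends up with the lowest refined score, which is the kind of effect a refinement step of this sort is meant to produce. How the paper builds its tag graph and chooses its restart vector is not described in the abstract.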



Acknowledgements

This work was supported by the National Natural Science Foundation of China (60973105 and 61003111) and the fund of the State Key Laboratory of Software Development Environment (SKLSDE-2011ZX-03). The authors would like to thank the Editors and the anonymous reviewers for their valuable comments and remarks on the previous versions of this paper.

Author information

Authors and Affiliations

State Key Laboratory of Software Development Environment, Beihang University, Beijing, 100191, China
Xiaoming Zhang, Zhoujun Li & Wenhan Chao
School of Computer Science and Engineering, Beihang University, Beijing, 100191, China
Xiaoming Zhang, Zhoujun Li & Wenhan Chao
Beijing Key Laboratory of Network Technology, Beihang University, Beijing, 100191, China
Xiaoming Zhang, Zhoujun Li & Wenhan Chao

Corresponding author

Correspondence to Xiaoming Zhang.

About this article

Cite this article

Zhang, X., Li, Z. & Chao, W. Improving image tags by exploiting web search results. Multimed Tools Appl 62, 601–631 (2013). https://doi.org/10.1007/s11042-011-0863-5

Published: 27 August 2011
Issue Date: February 2013
DOI: https://doi.org/10.1007/s11042-011-0863-5
