Abstract
Tags associated with web videos play a crucial role in organizing and accessing large-scale video collections. However, the raw tag list (RawL) is usually incomplete, imprecise and unranked, which reduces the usability of tags. Meanwhile, web video tags have not been studied to the same extent as web image tags, whose quality improvement has received considerable attention. In this paper, we propose a novel web video tag enhancement approach called video retagging, which aims at producing a more complete, precise, and ranked retagged tag list (RetL) for web videos. Given a web video, video retagging first collects its textually and visually related neighbor videos. All tags attached to the neighbors are treated as candidate relevant tags, and RetL is then generated by inferring the degree of relevance of each tag from both global and video-specific perspectives, using two different graph-based models. Two kinds of experiments, namely application-oriented video search and categorization, and user-based subjective studies, are carried out on a large-scale web video dataset. They demonstrate that in most cases RetL is better than RawL in terms of completeness, precision and ranking.
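The candidate-pooling step described above can be illustrated with a minimal sketch. It is not the paper's method: the two graph-based relevance models are not specified in the abstract, so a plain similarity-weighted neighbor vote is swapped in as a stand-in. The function name `retag_sketch`, the `(similarity, tags)` input format, and the sample scores are all assumptions made for illustration.

```python
from collections import defaultdict


def retag_sketch(neighbors, top_k=5):
    """Rank candidate tags pooled from neighbor videos.

    neighbors: list of (similarity, tags) pairs, where similarity is a
    hypothetical combined textual/visual score in [0, 1] (an assumption,
    not a quantity defined in the abstract).
    """
    scores = defaultdict(float)
    for similarity, tags in neighbors:
        for tag in set(tags):  # count each tag at most once per neighbor
            scores[tag] += similarity
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [tag for tag, _ in ranked[:top_k]]


# Made-up neighbors of a query video, for illustration only.
neighbors = [
    (0.9, ["soccer", "goal", "worldcup"]),
    (0.7, ["soccer", "highlights"]),
    (0.4, ["goal", "funny"]),
]
print(retag_sketch(neighbors, top_k=3))
```

Tags shared by several similar neighbors rise to the top, which gives a ranked and more complete list than any single raw tag list; a graph-based model would refine these scores further.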








Notes
The dataset has been updated and now contains more than 200 k web videos.
Acknowledgments
This work was supported by the National Basic Research Program of China (973 Program, 2007CB311100), the National Natural Science Foundation of China (60902090), the Beijing New Star Project on Science & Technology (2007B071), and the Co-building Program of Beijing Municipal Education Commission.
Cite this article
Chen, Z., Cao, J., Xia, T. et al. Web video retagging. Multimed Tools Appl 55, 53–82 (2011). https://doi.org/10.1007/s11042-010-0604-1
DOI: https://doi.org/10.1007/s11042-010-0604-1