Abstract
State-of-the-art content sharing platforms often require users to assign tags to pieces of media to make them easily retrievable. Because this task is sometimes perceived as tedious or boring, annotations can be sparse. Commenting, on the other hand, is a frequently used means of expressing user opinion towards shared media items. This work uses time series analyses to infer potential tags and indexing terms for audio-visual content from user comments. In this way, we mitigate the vocabulary gap between queries and document descriptors. Additionally, we show how large-scale encyclopaedias such as Wikipedia can aid the task of tag prediction by serving as surrogates for high-coverage natural language vocabulary lists. Our evaluation is conducted on a corpus of several million real-world user comments from the popular video sharing platform YouTube and demonstrates significant improvements in retrieval performance.
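The abstract does not spell out the time series analysis it uses, so the following is only an illustrative sketch of the general idea, not the authors' method: bucket comment terms by day, and propose as tag candidates those terms whose peak daily frequency stands out from their own average. The function name `bursty_terms`, the z-score criterion, the thresholds, and the optional `vocabulary` filter (standing in for a word list such as one derived from Wikipedia) are all assumptions made for this example.

```python
import re
from collections import Counter, defaultdict
from statistics import mean, pstdev


def bursty_terms(comments, bucket_seconds=86400, z_threshold=2.0,
                 min_count=3, vocabulary=None):
    """Return (term, z_score) pairs for terms with a pronounced usage peak.

    comments   : iterable of (unix_timestamp, text) pairs (hypothetical input shape)
    vocabulary : optional set of allowed terms; assumption standing in for an
                 external vocabulary list. None disables the filter.
    """
    per_term = defaultdict(Counter)  # term -> {bucket index: comment count}
    buckets = set()
    for timestamp, text in comments:
        bucket = int(timestamp // bucket_seconds)
        buckets.add(bucket)
        # Count each term at most once per comment.
        for term in set(re.findall(r"[a-z]+", text.lower())):
            if vocabulary is None or term in vocabulary:
                per_term[term][bucket] += 1
    if not buckets:
        return []

    span = range(min(buckets), max(buckets) + 1)
    scored = []
    for term, counts in per_term.items():
        series = [counts.get(b, 0) for b in span]
        peak = max(series)
        sigma = pstdev(series)
        # Skip rare terms and terms with a perfectly flat series.
        if peak < min_count or sigma == 0:
            continue
        z_score = (peak - mean(series)) / sigma
        if z_score >= z_threshold:
            scored.append((term, z_score))
    # Strongest bursts first; ties broken alphabetically.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    day = 86400
    demo = [(d * day + 10, "nice video") for d in range(10)]
    demo += [(5 * day + 100 + i, "great goal") for i in range(5)]
    for term, score in bursty_terms(demo):
        print(term, round(score, 2))
```

In this toy input, terms that appear at a constant rate every day are skipped, while terms concentrated on a single day are returned as candidates. A real system would additionally need stop-word handling and a more robust burst model.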
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Eickhoff, C., Li, W., de Vries, A.P. (2013). Exploiting User Comments for Audio-Visual Content Indexing and Retrieval. In: Serdyukov, P., et al. Advances in Information Retrieval. ECIR 2013. Lecture Notes in Computer Science, vol 7814. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36973-5_4
DOI: https://doi.org/10.1007/978-3-642-36973-5_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36972-8
Online ISBN: 978-3-642-36973-5
eBook Packages: Computer Science (R0)