Abstract
Manually annotating content such as Internet videos, is an intellectually expensive and time consuming process. Furthermore, keywords and community-provided tags lack consistency and present numerous irregularities. Addressing the challenge of simplifying and improving the process of tagging online videos, which is potentially not bounded to any particular domain, we present an algorithm for predicting user-tags from the associated textual metadata in this paper. Our approach is centred around extracting named entities exploiting complementary textual resources such as Wikipedia and Wordnet. More specifically to facilitate the extraction of semantically meaningful tags from a largely unstructured textual corpus we developed a natural language processing framework based on GATE architecture. Extending the functionalities of the in-built GATE named entities, the framework integrates a bag-of-articles algorithm for effectively searching through the Wikipedia articles for extracting relevant articles. The proposed framework has been evaluated against MediaEval 2010 Wild Wild Web dataset, which consists of large collection of Internet videos.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Hearst, M.: Automatic acquisition of hyponyms from large text corpora. In: Fourteenth International Conference on Comput. Linguistics, pp. 539–545 (1992)
Manning, C., Schütze, H.: Foundations of Statistical Natural Language Processing. MIT Press (1999)
Bast H., Dupret G., Majumdar D., Piwowarski B.: Discovering a Term Taxonomy from Term Similarities Using Principal Component Analysis. Semantic Web Mining (2006)
Cimiano, P., Völker, J.: Text2onto - A Framework for Ontology Learning and Data-Driven Change Discovery. In: Montoyo, A., Muńoz, R., Métais, E. (eds.) NLDB 2005. LNCS, vol. 3513, pp. 227–238. Springer, Heidelberg (2005)
Nemeth, Y., Shapira, B., Taeib-Maimon, M.: Evaluation of the real and perceived value of automatic and interactive query expansion. In: SIGIR (2006)
Shapira B., Taieb-Maimon M., Nemeth Y.: Subjective and objective evaluation of interactive and automatic query expansion. Online Information Review (2005)
Gong, Z., Cheang, C.W., Hou, U.L.: Web Query Expansion by WordNet. In: Andersen, K.V., Debenham, J., Wagner, R. (eds.) DEXA 2005. LNCS, vol. 3588, pp. 166–175. Springer, Heidelberg (2005)
Snow, R., Jurafsky, D., Ng, A.: Learning syntactic patterns for automatic hypernym discovery. In: NIPS (2005)
Nemrava, J.: Refining search queries using WordNet glosses. In: EKAW (2006)
Kliegr, T., Chandramouli, K., Nemrava, J., Svatek, V., Izquierdo, E.: Combining Captions and Visual Analysis for Image Concept Classification. In: Proceedings of the 9h International Workshop on Multimedia Data Mining (2008)
Kliegr, T.: Entity Classification by Bag of Wikipedia Articles. In: Doctoral Consortium, CIKM (2010)
Cucerza, S.: Large-scale named entity disambiguation based on Wikipedia data. In: Proc. of Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (2007)
Budanitsky A., Hirst G.: Evaluating wordnet-based measures of lexical semantic relatedness. Comput. Linguist. (2006)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chandramouli, K., Piatrik, T., Izquierdo, E. (2012). Predicting User Tags Using Semantic Expansion. In: Moschitti, A., Scandariato, R. (eds) Eternal Systems. EternalS 2011. Communications in Computer and Information Science, vol 255. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28033-7_8
Download citation
DOI: https://doi.org/10.1007/978-3-642-28033-7_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28032-0
Online ISBN: 978-3-642-28033-7
eBook Packages: Computer ScienceComputer Science (R0)