Abstract
This paper addresses the task of adding domain-specific semantic tags to a document, based solely on the domain ontology and on generic lexical and Web resources. This approach avoids the need for trained domain-specific lexical resources, which hinder the scalability of semantic annotation. More specifically, the proposed method maps the content of the document to concepts of the ontology, using the WordNet lexicon and Wikipedia. The method comprises a novel combination of semantic relatedness measures and word sense disambiguation techniques to identify the ontology concepts most related to the document. We test the method on two case studies: (a) a set of summaries accompanying environmental news videos, and (b) a set of medical abstracts. The results in both cases show that the proposed method achieves reasonable performance, thus pointing to a promising path for scalable semantic annotation of documents.
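As a rough illustration of what mapping document content to ontology concepts can look like, the sketch below ranks a few concepts by how strongly their glosses overlap with the words of a document. It is a minimal sketch under stated assumptions, not the method of the paper: it swaps in a plain Lesk-style gloss overlap (Jaccard score) for the relatedness and disambiguation measures the abstract mentions, and the concept names, glosses, stopword list, and function names are all hypothetical.

```python
import re

# Hypothetical ontology concepts with hand-written toy glosses.
# A real system would draw such text from WordNet or Wikipedia.
CONCEPT_GLOSSES = {
    "Flood": "overflow of river water covering land after heavy rain",
    "Wildfire": "uncontrolled fire burning forest and dry vegetation",
    "AirPollution": "harmful smoke gases and particles in the air",
}

# Small illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"a", "an", "and", "after", "in", "of", "the", "to", "by", "has"}


def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def rank_concepts(document, glosses=CONCEPT_GLOSSES):
    """Rank concepts by Jaccard overlap between document and gloss words."""
    doc_words = content_words(document)
    scores = {}
    for concept, gloss in glosses.items():
        gloss_words = content_words(gloss)
        union = doc_words | gloss_words
        shared = doc_words & gloss_words
        scores[concept] = len(shared) / len(union) if union else 0.0
    # Highest-scoring concept first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


summary = "Heavy rain caused the river to overflow, covering farm land with water."
ranking = rank_concepts(summary)
print(ranking[0][0])
```

Surface word overlap of this kind misses synonyms and ambiguous terms, which is the gap that relatedness measures and word sense disambiguation are meant to close.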
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zavitsanos, E., Tsatsaronis, G., Varlamis, I., Paliouras, G. (2010). Scalable Semantic Annotation of Text Using Lexical and Web Resources. In: Konstantopoulos, S., Perantonis, S., Karkaletsis, V., Spyropoulos, C.D., Vouros, G. (eds) Artificial Intelligence: Theories, Models and Applications. SETN 2010. Lecture Notes in Computer Science, vol 6040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12842-4_32
DOI: https://doi.org/10.1007/978-3-642-12842-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12841-7
Online ISBN: 978-3-642-12842-4
eBook Packages: Computer Science, Computer Science (R0)