Scalable Semantic Annotation of Text Using Lexical and Web Resources

Zavitsanos, Elias; Tsatsaronis, George; Varlamis, Iraklis; Paliouras, Georgios

doi:10.1007/978-3-642-12842-4_32

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 6040)

Included in the following conference series:

Hellenic Conference on Artificial Intelligence

Abstract

This paper addresses the task of adding domain-specific semantic tags to a document, based solely on the domain ontology and generic lexical and Web resources. This avoids the need for trained domain-specific lexical resources, which hinder the scalability of semantic annotation. More specifically, the proposed method maps the content of the document to concepts of the ontology, using the WordNet lexicon and Wikipedia. The method comprises a novel combination of measures of semantic relatedness and word sense disambiguation techniques to identify the ontology concepts most related to the document. We test the method on two case studies: (a) a set of summaries accompanying environmental news videos, and (b) a set of medical abstracts. The results in both cases show that the proposed method achieves reasonable performance, thus pointing to a promising path for scalable semantic annotation of documents.
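The general idea of mapping document content to ontology concepts through a relatedness measure can be illustrated with a minimal sketch. It is not the authors' method: the averaging rule, the function names (`toy_relatedness`, `rank_concepts`) and the score table are all hypothetical, the word sense disambiguation step is omitted, and the hand-written scores merely stand in for measures that would be derived from WordNet or Wikipedia.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

# Hypothetical relatedness scores in [0, 1]. In a real system these would come
# from WordNet- or Wikipedia-based measures, not from a hand-written table.
TOY_SCORES: Dict[FrozenSet[str], float] = {
    frozenset({"flood", "water"}): 0.8,
    frozenset({"river", "water"}): 0.9,
    frozenset({"flood", "disaster"}): 0.7,
    frozenset({"river", "disaster"}): 0.2,
}


def toy_relatedness(a: str, b: str) -> float:
    """Symmetric lookup; identical words score 1.0, unknown pairs score 0.0."""
    if a == b:
        return 1.0
    return TOY_SCORES.get(frozenset({a, b}), 0.0)


def rank_concepts(
    terms: List[str],
    concepts: List[str],
    relatedness: Callable[[str, str], float],
) -> List[Tuple[str, float]]:
    """Score each ontology concept by its mean relatedness to the document terms.

    Averaging is an assumption made for this sketch only.
    """
    scored = []
    for concept in concepts:
        total = sum(relatedness(term, concept) for term in terms)
        mean = total / len(terms) if terms else 0.0
        scored.append((concept, mean))
    # Highest-scoring concepts first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Usage: two document terms ranked against three candidate concepts.
ranked = rank_concepts(
    ["flood", "river"], ["water", "disaster", "medicine"], toy_relatedness
)
for concept, score in ranked:
    print(f"{concept}: {score:.2f}")
```

Passing the relatedness function as a parameter keeps the ranking step independent of the resource behind it, so a WordNet-based or Wikipedia-based measure could be swapped in without changing `rank_concepts`.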

Author information

Authors and Affiliations

Institute of Informatics & Telecommunications, NCSR “Demokritos”
Elias Zavitsanos & Georgios Paliouras
Department of Computer and Information Science, Norwegian University of Science and Technology
George Tsatsaronis
Department of Informatics and Telematics, Harokopio University
Iraklis Varlamis

Editor information

Editors and Affiliations

Institute of Informatics and Telecommunications, NCSR Demokritos, Ag. Paraskevi, 15310, Athens, Greece
Stasinos Konstantopoulos, Stavros Perantonis, Vangelis Karkaletsis & Constantine D. Spyropoulos
Department of Information and Communication Systems Engineering, University of the Aegean, 83200, Karlovassi, Samos, Greece
George Vouros

About this paper

Cite this paper

Zavitsanos, E., Tsatsaronis, G., Varlamis, I., Paliouras, G. (2010). Scalable Semantic Annotation of Text Using Lexical and Web Resources. In: Konstantopoulos, S., Perantonis, S., Karkaletsis, V., Spyropoulos, C.D., Vouros, G. (eds) Artificial Intelligence: Theories, Models and Applications. SETN 2010. Lecture Notes in Computer Science, vol 6040. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12842-4_32

DOI: https://doi.org/10.1007/978-3-642-12842-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12841-7
Online ISBN: 978-3-642-12842-4
eBook Packages: Computer Science, Computer Science (R0)
