Abstract
This paper presents an approach to automating semantic annotation within service-oriented architectures that provide interfaces to databases of spatial-information objects. Automating the annotation process eases the transition from current state-of-the-art architectures to semantically enabled ones. We treat annotation as the task of matching an arbitrary word or term with the most appropriate concept in the domain ontology. The term matching techniques that we present are based on text mining. To determine the similarity between two terms, we first associate each term with a set of documents obtained from a Web search engine. We then transform the documents into feature vectors, which moves the similarity assessment into the feature space. Finally, we compute the similarity by training a classifier to distinguish between ontology concepts. Apart from the text mining approaches, we also present an alternative technique, Google Distance, which proves less suitable for our task. The paper also reports an extensive evaluation of the presented term matching methods, which shows that they work best on synonymous nouns from a specific vocabulary. Furthermore, the fast and simple centroid-based classifier is shown to perform very well for this task. The main contribution of this paper is thus a term matching algorithm based on text mining and information retrieval, together with an evaluation that indicates how the algorithm performs in various scenarios.
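The pipeline outlined in the abstract (documents per term, feature vectors, a centroid-based classifier over ontology concepts) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the chapter: the TF-IDF weighting with smoothed IDF, the cosine similarity, the function names (`build_idf`, `vectorize`, `centroid`, `match_term`), and the toy documents, which stand in for text that would be retrieved from a Web search engine.

```python
import math
from collections import Counter


def tokenize(text):
    # Lowercase and keep purely alphabetic tokens (a deliberately crude tokenizer).
    return [w for w in text.lower().split() if w.isalpha()]


def build_idf(docs):
    # Smoothed inverse document frequency computed over the concept documents.
    n = len(docs)
    df = Counter(w for d in docs for w in set(tokenize(d)))
    idf = {w: math.log((1 + n) / (1 + c)) + 1.0 for w, c in df.items()}
    default_idf = math.log(1 + n) + 1.0  # weight for words unseen in concept docs
    return idf, default_idf


def normalize(vec):
    # Scale a sparse vector (dict) to unit Euclidean length.
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {w: v / norm for w, v in vec.items()} if norm else {}


def vectorize(doc, idf, default_idf):
    # TF-IDF vector for one document, L2-normalised.
    tf = Counter(tokenize(doc))
    return normalize({w: c * idf.get(w, default_idf) for w, c in tf.items()})


def centroid(vectors):
    # Sum the document vectors and renormalise; a common scale factor
    # does not change the direction, so this equals the normalised mean.
    total = Counter()
    for vec in vectors:
        for w, v in vec.items():
            total[w] += v
    return normalize(total)


def cosine(a, b):
    # Dot product of two unit-length sparse vectors.
    return sum(v * b.get(w, 0.0) for w, v in a.items())


def match_term(term_docs, concept_docs):
    # Rank ontology concepts by similarity between the term's centroid
    # and each concept's centroid, best match first.
    all_docs = [d for docs in concept_docs.values() for d in docs]
    idf, default_idf = build_idf(all_docs)
    term_vec = centroid(vectorize(d, idf, default_idf) for d in term_docs)
    scores = {
        concept: cosine(term_vec, centroid(vectorize(d, idf, default_idf) for d in docs))
        for concept, docs in concept_docs.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy stand-ins for documents returned by a search engine (hypothetical data).
concept_docs = {
    "river": [
        "a river is a natural stream of water flowing to the sea",
        "the river flows through the valley to a lake",
    ],
    "road": [
        "a road is a paved way for vehicles and cars",
        "traffic on the road includes cars and trucks",
    ],
}
term_docs = [
    "a watercourse is a channel through which water flows",
    "stream of water flowing in a channel",
]

ranking = match_term(term_docs, concept_docs)
print(ranking)
```

On this toy data the term "watercourse" shares far more vocabulary with the "river" documents than with the "road" documents, so "river" ranks first. A centroid classifier is convenient in this setting because training amounts to averaging vectors, which keeps it cheap when each term brings in a fresh batch of documents.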
References
Cardoso-Cachopo, A., Oliveira, L.A.: Empirical Evaluation of Centroid-based Models for Single-label Text Categorization. INESC-ID Technical Report 7/2006 (2006)
Cilibrasi, R., Vitanyi, P.: Automatic Meaning Discovery Using Google (2004)
de Bruijn, J.: The Web Service Modeling Language WSML (2005)
Public STINET (Scientific & Technical Information Network), http://stinet.dtic.mil/str/thesaurus.html
Etzioni, O., Cafarella, M., Downey, D., et al.: Web-scale Information Extraction in KnowItAll (Preliminary Results). In: Proceedings of WWW 2004, New York, USA (2004)
Grcar, M., Klien, E., Fitzner, D.I., Maué, P., Mladenic, D., Grobelnik, M.: D4.1: Representational Language for Web-service Annotation Models. Project Report FP6-026514 SWING, WP 4, D4.1 (2006)
GEMET Thesaurus (General Multilingual Environmental Thesaurus), http://www.eionet.europa.eu/gemet/
Mitchell, T.M.: Machine Learning. The McGraw-Hill Companies, Inc., New York (1997)
Open Geospatial Consortium: Web Feature Service Implementation Specification, Version 1.0.0 (OGC Implementation Specification 02-058) (2002)
WordNet, a lexical database of English, http://wordnet.princeton.edu/
Vapnik, V.: Statistical Learning Theory. Wiley, New York (1998)
Gligorov, R., Aleksovski, Z., ten Warner, K., van Harmelen, F.: Using Google Distance to Weight Approximate Ontology Matches. In: Proceedings of WWW 2007, Banff, Alberta, Canada (2007)
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Grčar, M., Klien, E., Novak, B. (2009). Using Term-Matching Algorithms for the Annotation of Geo-services. In: Berendt, B., et al. Knowledge Discovery Enhanced with Semantic and Social Information. Studies in Computational Intelligence, vol 220. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01891-6_8
DOI: https://doi.org/10.1007/978-3-642-01891-6_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01890-9
Online ISBN: 978-3-642-01891-6
eBook Packages: Engineering, Engineering (R0)