Abstract
Today’s search engines build their indices on the basis of document mark-up in XML and significant letter sequences (words) occurring in the document texts. There are some drawbacks, however: the XML mark-up requires skill as well as tedious work from the user posting the document, and the indexing based on significant word distributions, though automatic and highly effective, is not as precise as required by many applications.
As a complement to current methods, this paper presents an automatic content analysis of texts that combines traditional linguistic methods with a comparatively new data structure [6] and algorithm [3]. Having already presented the formal definitions elsewhere, we aim here to illustrate the system in action, based on an ongoing implementation in JAVA.
References
[1] Abney, S.: Parsing by Chunks. In: Berwick, R., Abney, S., Tenny, C. (eds.) Principle-Based Parsing. Kluwer Academic Publishers, Dordrecht (1991)
[2] Berners-Lee, T., Hendler, J., Lassila, O.: The Semantic Web. Scientific American 284(5) (2001)
[3] Hausser, R.: Complexity in Left-Associative Grammar. Theoretical Computer Science 106(2), 283–308 (1992)
[4] Hausser, R. (ed.): Linguistische Verifikation. Dokumentation zur Ersten Morpholympics. Max Niemeyer Verlag, Tübingen (1996)
[5] Hausser, R.: Foundations of Computational Linguistics: Human-Computer Communication in Natural Language, 2nd edn. Springer, Berlin (1999/2001)
[6] Hausser, R.: Database Semantics for Natural Language. Artificial Intelligence 130(1), 27–74 (2001)
[7] Hausser, R.: Turn Taking in Database Semantics. In: Kangassalo, H., et al. (eds.) Information Modeling and Knowledge Bases XVI. IOS Press, Amsterdam (2005) (to appear)
[8] Kycia, A.: Implementierung der Datenbanksemantik in JAVA. MA-thesis. Universität Erlangen-Nürnberg (2004)
[9] Vergne, J.: Une méthode pour l’analyse descendante et calculatoire de corpus multilingues: application au calcul des relations sujet-verbe. Actes de TALN, 63–74 (2002)
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hausser, R. (2004). Applying Database Semantics to the WWW. In: Bussler, C., et al. Web Information Systems – WISE 2004 Workshops. WISE 2004. Lecture Notes in Computer Science, vol 3307. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30481-4_21
DOI: https://doi.org/10.1007/978-3-540-30481-4_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23892-8
Online ISBN: 978-3-540-30481-4
eBook Packages: Springer Book Archive