Abstract
Many of the proposed approaches to the semantic web share a substantial drawback. They are all based on the idea that web pages (or, more generally, resources) will contain semantic annotations that allow remote agents to access them. However, the problem of creating these annotations is seldom addressed. Manual creation of the annotations is not a feasible option, except in a few experimental cases.
We propose an approach based on Language Processing techniques that addresses this issue, at least for textual resources (which still constitute the vast majority of the material available on the web). Documents are analyzed fully automatically and converted into a semantic annotation, which can then be stored together with the original documents. It is this annotation that constitutes the machine-understandable resource that remote agents can query. A semi-automatic approach is also considered, in which the system suggests candidate annotations and the user simply approves or rejects them. Advantages and drawbacks of both approaches are discussed.
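The two modes described above can be illustrated with a minimal Python sketch. It is not the system presented in the paper: the names `Annotation`, `suggest_annotations`, and `review` are hypothetical, and a naive regular expression over runs of capitalized words stands in for the actual linguistic analysis, which the abstract does not specify. The sketch only shows how a single approval step separates the fully automatic mode from the semi-automatic one.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Annotation:
    """A candidate annotation: a labelled span of the source text (hypothetical type)."""
    start: int
    end: int
    text: str
    label: str


# Assumption: runs of two or more capitalized words stand in for real
# linguistic analysis. This is a placeholder, not the paper's method.
_CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)+\b")


def suggest_annotations(document: str) -> List[Annotation]:
    """Propose candidate annotations for a document, with character offsets."""
    return [
        Annotation(m.start(), m.end(), m.group(0), "Entity")
        for m in _CANDIDATE.finditer(document)
    ]


def review(
    candidates: List[Annotation],
    approve: Callable[[Annotation], bool],
) -> List[Annotation]:
    """Keep only the candidates that the approval callback accepts."""
    return [c for c in candidates if approve(c)]


if __name__ == "__main__":
    doc = "the Semantic Web was proposed by Tim Berners Lee."
    candidates = suggest_annotations(doc)

    # Fully automatic mode: every suggestion is stored as is.
    automatic = review(candidates, lambda c: True)

    # Semi-automatic mode: a user decision (simulated here) filters suggestions.
    semi_automatic = review(candidates, lambda c: c.text != "Semantic Web")

    print([c.text for c in automatic])
    print([c.text for c in semi_automatic])
```

In an interactive setting, the `approve` callback would prompt the user for each candidate. The accepted annotations could then be serialized and stored alongside the original document.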
© 2003 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Rinaldi, F., Kaljurand, K., Dowdall, J., Hess, M. (2003). Breaking the Deadlock. In: Meersman, R., Tari, Z., Schmidt, D.C. (eds) On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE. OTM 2003. Lecture Notes in Computer Science, vol 2888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39964-3_55
DOI: https://doi.org/10.1007/978-3-540-39964-3_55
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20498-5
Online ISBN: 978-3-540-39964-3
eBook Packages: Springer Book Archive