Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2888)

Abstract

Many of the proposed approaches to the semantic web share a substantial drawback: they assume that web pages (or, more generally, resources) will contain semantic annotations that allow remote agents to access them. However, the problem of creating these annotations is seldom addressed. Manual creation of the annotations is not a feasible option, except in a few experimental cases.

We propose an approach based on language processing techniques that addresses this issue, at least for textual resources (which still constitute the vast majority of the material available on the web). Documents are analyzed fully automatically and converted into a semantic annotation, which can then be stored together with the original documents. It is this annotation that constitutes the machine-understandable resource that remote agents can query. A semi-automatic approach is also considered, in which the system suggests candidate annotations and the user simply approves or rejects them. The advantages and drawbacks of both approaches are discussed.

Copyright information

© 2003 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Rinaldi, F., Kaljurand, K., Dowdall, J., Hess, M. (2003). Breaking the Deadlock. In: Meersman, R., Tari, Z., Schmidt, D.C. (eds) On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE. OTM 2003. Lecture Notes in Computer Science, vol 2888. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39964-3_55

  • DOI: https://doi.org/10.1007/978-3-540-39964-3_55

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-20498-5

  • Online ISBN: 978-3-540-39964-3

  • eBook Packages: Springer Book Archive
