Towards a Semantic Representation of Documents by Ontology-Document Mapping

  • Conference paper
Artificial Intelligence: Methodology, Systems, and Applications (AIMSA 2004)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3192)

Abstract

This paper deals with the use of ontologies in the field of Information Retrieval. It introduces an approach to representing document content by ontology-document matching. The approach detects concepts (single-word and multiword) in a document using a general-purpose ontology, namely WordNet. Two criteria are then applied: co-occurrence, to identify the important concepts in a document, and semantic similarity, to compute the semantic relatedness between these concepts and then disambiguate them. The result is a set of scored concept senses (nodes) with weighted links, called the semantic core of the document, which best represents the document's semantic content. We regard the proposed and evaluated approach as a short but strong step toward the long-term goal of Intelligent Indexing and Semantic Retrieval.
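The disambiguation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sense inventory `SENSES`, the similarity table `SIM`, and the function `semantic_core` are hypothetical stand-ins for WordNet senses and a real similarity measure, and the co-occurrence step is omitted. The sketch chooses one sense per detected term so that the summed pairwise similarity is maximal, then scores each chosen sense by the weights of its links.

```python
from itertools import product

# Hypothetical toy sense inventory: term -> candidate senses
# (stand-ins for WordNet synsets; not taken from the paper).
SENSES = {
    "bank": ["bank#finance", "bank#river"],
    "loan": ["loan#finance"],
    "interest": ["interest#finance", "interest#curiosity"],
}

# Hypothetical symmetric similarity scores between senses.
# Pairs that are not listed count as 0.
SIM = {
    frozenset(["bank#finance", "loan#finance"]): 0.9,
    frozenset(["bank#finance", "interest#finance"]): 0.8,
    frozenset(["loan#finance", "interest#finance"]): 0.85,
    frozenset(["bank#river", "interest#curiosity"]): 0.1,
}


def sim(a, b):
    """Look up the similarity of two senses (0.0 if unknown)."""
    return SIM.get(frozenset([a, b]), 0.0)


def semantic_core(terms):
    """Pick one sense per term maximizing total pairwise similarity.

    Returns (scores, links): scores maps each chosen sense to the sum
    of its link weights; links maps sense pairs to their similarity.
    """
    best_combo, best_total = None, -1.0
    # Exhaustive search over sense combinations: fine for a toy
    # example, exponential in the number of ambiguous terms.
    for combo in product(*(SENSES[t] for t in terms)):
        total = sum(
            sim(a, b)
            for i, a in enumerate(combo)
            for b in combo[i + 1:]
        )
        if total > best_total:
            best_combo, best_total = combo, total

    links = {
        (a, b): sim(a, b)
        for i, a in enumerate(best_combo)
        for b in best_combo[i + 1:]
        if sim(a, b) > 0
    }
    scores = {
        s: sum(sim(s, other) for other in best_combo if other != s)
        for s in best_combo
    }
    return scores, links


if __name__ == "__main__":
    scores, links = semantic_core(["bank", "loan", "interest"])
    for sense, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(sense, round(score, 2))
```

With these toy values, the financial senses of all three terms are selected because they reinforce one another. A real system would replace the table with a WordNet-based similarity measure and would need a cheaper search than full enumeration.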

Copyright information

© 2004 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Baziz, M. (2004). Towards a Semantic Representation of Documents by Ontology-Document Mapping. In: Bussler, C., Fensel, D. (eds) Artificial Intelligence: Methodology, Systems, and Applications. AIMSA 2004. Lecture Notes in Computer Science, vol 3192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30106-6_4

  • DOI: https://doi.org/10.1007/978-3-540-30106-6_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-22959-9

  • Online ISBN: 978-3-540-30106-6

  • eBook Packages: Springer Book Archive
