An Abstraction-Based Data Model for Information Retrieval

  • Conference paper
AI 2009: Advances in Artificial Intelligence (AI 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5866)

Abstract

Language ontologies provide an avenue for automated lexical analysis that may be used to supplement existing information retrieval methods. This paper presents a method of information retrieval that uses WordNet, a lexical database, to generate paths of abstraction, and uses those paths as the basis for an inverted index structure for retrieving documents from an indexed corpus. We present this method as an entrée to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.
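
As a rough illustration of the idea in the abstract, the sketch below indexes each document not only under the words it contains but also under every more abstract concept on each word's path toward the root of a taxonomy. It is not the authors' implementation: the `HYPERNYMS` map is a tiny hand-written, hypothetical stand-in for WordNet with one sense per word, and the names `abstraction_path`, `build_index`, and `search` are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent), standing in for WordNet.
# Real WordNet words have several senses; this sketch assumes exactly one.
HYPERNYMS = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "carnivore",
    "cat": "feline",
    "feline": "carnivore",
    "carnivore": "animal",
    "animal": "entity",
}


def abstraction_path(word):
    """Return the chain from `word` up to the root of the toy taxonomy."""
    path = [word]
    while path[-1] in HYPERNYMS:
        path.append(HYPERNYMS[path[-1]])
    return path


def build_index(docs):
    """Map every concept on every token's abstraction path to document ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            for concept in abstraction_path(token):
                index[concept].add(doc_id)
    return index


def search(index, term):
    """Return the sorted ids of documents indexed under `term`."""
    return sorted(index.get(term.lower(), set()))


DOCS = {
    1: "the dog barked",
    2: "a wolf howled",
    3: "the cat slept",
}

INDEX = build_index(DOCS)
print(search(INDEX, "canine"))     # documents mentioning any canine
print(search(INDEX, "carnivore"))  # a more abstract query matches more documents
```

In this toy setting, a query for an abstract term such as "canine" retrieves documents that never contain that word, which is the behaviour an abstraction-based index is meant to provide. Choosing among multiple senses of a word, which the abstract names as a research goal, is deliberately left out.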

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

McAllister, R.A., Angryk, R.A. (2009). An Abstraction-Based Data Model for Information Retrieval. In: Nicholson, A., Li, X. (eds) AI 2009: Advances in Artificial Intelligence. AI 2009. Lecture Notes in Computer Science, vol 5866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10439-8_57

  • DOI: https://doi.org/10.1007/978-3-642-10439-8_57

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-10438-1

  • Online ISBN: 978-3-642-10439-8

  • eBook Packages: Computer Science, Computer Science (R0)
