An Abstraction-Based Data Model for Information Retrieval

McAllister, Richard A.; Angryk, Rafal A.

doi:10.1007/978-3-642-10439-8_57


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5866)

Included in the following conference series:

Australasian Joint Conference on Artificial Intelligence

Abstract

Language ontologies provide an avenue for automated lexical analysis that can supplement existing information retrieval methods. This paper presents an information retrieval method that uses WordNet, a lexical database, to generate paths of abstraction, and uses those paths as the basis for an inverted index structure for retrieving documents from an indexed corpus. We present this method as an entrée to a line of research on using ontologies to perform word-sense disambiguation and improve the precision of existing information retrieval techniques.
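
The idea of indexing documents by paths of abstraction can be illustrated with a minimal sketch. It is not the authors' implementation, which is not shown in this preview. The `HYPERNYM` dictionary is a hypothetical toy taxonomy standing in for WordNet hypernym links, and the function names `abstraction_path`, `build_index`, and `search` are assumptions made for illustration. Each word is expanded into the chain of increasingly general concepts above it, and every concept on that chain becomes a key in an inverted index.

```python
from collections import defaultdict

# Hypothetical toy taxonomy (child -> parent); a stand-in for WordNet
# hypernym links, not data taken from WordNet itself.
HYPERNYM = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
}


def abstraction_path(word):
    """Return the chain from the word up to the taxonomy root, most specific first."""
    path = [word]
    while path[-1] in HYPERNYM:
        path.append(HYPERNYM[path[-1]])
    return tuple(path)


def build_index(corpus):
    """Map every concept on every token's abstraction path to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for token in text.lower().split():
            for concept in abstraction_path(token):
                index[concept].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents that match every query term at some abstraction level."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()


corpus = {
    "d1": "the dog barked",
    "d2": "a wolf howled",
    "d3": "the cat slept",
}
index = build_index(corpus)
```

With this toy corpus, `search(index, "canine")` retrieves the documents that mention a dog or a wolf even though neither contains the word "canine". The sketch assigns each word a single path and so sidesteps word-sense disambiguation, which the abstract names as a goal of the research.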

Author information

Authors and Affiliations

Montana State University Department of Computer Science, Bozeman, MT, 59717-3880
Richard A. McAllister & Rafal A. Angryk

Editor information

Editors and Affiliations

Clayton School of Information Technology, Monash University, 3800, Clayton, VIC, Australia
Ann Nicholson
School of Computer Science and Information Technology, RMIT University, 3001, Melbourne, VIC, Australia
Xiaodong Li

About this paper

Cite this paper

McAllister, R.A., Angryk, R.A. (2009). An Abstraction-Based Data Model for Information Retrieval. In: Nicholson, A., Li, X. (eds) AI 2009: Advances in Artificial Intelligence. AI 2009. Lecture Notes in Computer Science, vol 5866. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-10439-8_57

DOI: https://doi.org/10.1007/978-3-642-10439-8_57
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-10438-1
Online ISBN: 978-3-642-10439-8
eBook Packages: Computer Science, Computer Science (R0)
