Abstract
The article presents an algorithm for retrieving textual information from a document collection. The algorithm employs a category system that organizes the repository and uses interaction with the user to improve search precision. The algorithm was implemented for the Simple English Wikipedia, and the first evaluation results indicate that the proposed method can help to retrieve information from large document repositories.
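The abstract does not give the algorithm's details, so the following is only a minimal sketch of one possible reading of "category system plus user interaction": a keyword search returns candidate articles, the system proposes categories that split the result set, and the user's yes/no answer narrows it. The function names (`search`, `candidate_categories`, `refine`), the toy `docs` data, and the selection rule are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter

# Hypothetical toy repository: article title -> text and category labels.
docs = {
    "Jaguar (animal)": {"text": "the jaguar is a large cat",
                        "categories": ["Animals", "Cats"]},
    "Jaguar Cars": {"text": "jaguar is a british car maker",
                    "categories": ["Companies", "Cars"]},
    "Lion": {"text": "the lion is a large cat",
             "categories": ["Animals", "Cats"]},
}

def search(docs, query):
    """Return titles of documents whose text contains every query term."""
    terms = query.lower().split()
    return [title for title, doc in docs.items()
            if all(t in doc["text"].lower().split() for t in terms)]

def candidate_categories(docs, hits):
    """Categories covering some, but not all, hits (most frequent first).

    Only such categories can narrow the result set when the user answers.
    """
    counts = Counter(c for title in hits for c in docs[title]["categories"])
    return [c for c, n in counts.most_common() if n < len(hits)]

def refine(docs, hits, category, relevant):
    """Keep hits inside the category if the user marked it relevant, else outside."""
    return [title for title in hits
            if (category in docs[title]["categories"]) == relevant]

# One interaction round: the query is ambiguous, the user confirms "Animals".
hits = search(docs, "jaguar")
questions = candidate_categories(docs, hits)
narrowed = refine(docs, hits, "Animals", relevant=True)
```

In this sketch each answer removes the articles that fall on the wrong side of the chosen category, which is one simple way that interaction could raise precision for an ambiguous query.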
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Szymański, J. (2012). Interactive Information Retrieval Algorithm for Wikipedia Articles. In: Yin, H., Costa, J.A.F., Barreto, G. (eds) Intelligent Data Engineering and Automated Learning - IDEAL 2012. IDEAL 2012. Lecture Notes in Computer Science, vol 7435. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32639-4_25
DOI: https://doi.org/10.1007/978-3-642-32639-4_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32638-7
Online ISBN: 978-3-642-32639-4
eBook Packages: Computer Science, Computer Science (R0)