Abstract
Query expansion techniques aim to find a set of query terms that improves retrieval performance. One limitation of existing techniques is that a query is often expanded using only the linguistic features of its terms. This paper presents a novel semantic query expansion technique that combines association rules with ontologies and information retrieval techniques. We propose using association rule discovery to find good candidate terms for expansion. These candidate terms are derived automatically from the collections and added to the original query. Our method differs from others in that 1) it utilizes the semantic as well as the linguistic properties of an unstructured text corpus and 2) it makes use of the contextual properties of important terms discovered by association rules. Experiments conducted on a subset of the TREC collections give encouraging results: we achieve improvements of 15.49% to 20.98% in terms of P@20 on TREC5 ad hoc queries.
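The general idea of expanding a query with terms found by association rule discovery can be illustrated with a minimal sketch. This is not the authors' implementation: it omits the ontology component entirely, treats each document as a set of whitespace-separated terms, and mines only pairwise rules. The function names (`mine_term_rules`, `expand_query`), the thresholds, and the toy corpus are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def mine_term_rules(docs, min_support=0.4, min_confidence=0.6):
    """Mine pairwise rules lhs -> rhs from term co-occurrence in documents.

    support(lhs, rhs)    = fraction of documents containing both terms
    confidence(lhs->rhs) = docs containing both / docs containing lhs
    Thresholds are illustrative assumptions, not values from the paper.
    """
    n_docs = len(docs)
    term_sets = [set(doc.lower().split()) for doc in docs]

    term_count = Counter()
    pair_count = Counter()
    for terms in term_sets:
        term_count.update(terms)
        pair_count.update(combinations(sorted(terms), 2))

    rules = {}
    for (a, b), count in pair_count.items():
        if count / n_docs < min_support:
            continue  # pair is too rare to be trusted
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / term_count[lhs]
            if confidence >= min_confidence:
                rules.setdefault(lhs, []).append((rhs, confidence))
    return rules


def expand_query(query, rules, max_new_terms=2):
    """Append the highest-confidence rule consequents to the query terms."""
    original = query.lower().split()
    seen = set(original)

    # Collect candidate expansion terms triggered by any query term.
    candidates = []
    for term in original:
        for rhs, confidence in rules.get(term, []):
            if rhs not in seen:
                candidates.append((confidence, rhs))
    # Highest confidence first; ties broken alphabetically for determinism.
    candidates.sort(key=lambda item: (-item[0], item[1]))

    expanded = list(original)
    for _, rhs in candidates:
        if len(expanded) - len(original) >= max_new_terms:
            break
        if rhs not in seen:
            expanded.append(rhs)
            seen.add(rhs)
    return expanded


# Hypothetical toy corpus, used only to exercise the functions.
docs = [
    "query expansion retrieval",
    "query expansion terms",
    "association rules mining",
    "query retrieval ranking",
    "association rules retrieval",
]
rules = mine_term_rules(docs)
print(expand_query("query", rules))
```

In this sketch the candidate terms come purely from co-occurrence statistics in the collection; a fuller system along the lines described in the abstract would additionally filter or enrich the candidates using an ontology.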
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Song, M., Song, IY., Hu, X., Allen, R. (2005). Semantic Query Expansion Combining Association Rules with Ontologies and Information Retrieval Techniques. In: Tjoa, A.M., Trujillo, J. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2005. Lecture Notes in Computer Science, vol 3589. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11546849_32
DOI: https://doi.org/10.1007/11546849_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-28558-8
Online ISBN: 978-3-540-31732-6
eBook Packages: Computer Science, Computer Science (R0)