Abstract
In an information retrieval system, a thesaurus can be used for query expansion, i.e. adding words to queries in order to improve recall. We propose a semi-automatic and interactive approach for the creation and maintenance of domain-specific thesauri for query expansion. Domain-specific thesauri are especially required in highly technical domains where the use of general thesauri for query expansion introduces more noise than useful results. Our semi-automatic approach to thesaurus creation constitutes a good compromise between fully manual approaches, which produce high-quality thesauri but at a prohibitively high cost, and fully automatic approaches, which are cheap but produce thesauri of limited quality. This article describes our approach and the architecture of the system implementing it, named Cannelle. It exploits user query logs and natural language processing to identify valuable synonymy candidates, and allows editors to interactively explore and validate these candidates in the context of a domain-specific searchable knowledge base. We evaluated the system in the domain of online troubleshooting, where the proposed method yielded an improvement in the quality of the search results obtained.
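As an illustration of the query expansion idea described above, the following minimal sketch adds validated synonyms to a query. It is not Cannelle's implementation: the abstract does not specify a rule format, so the `THESAURUS` mapping, its example entries, and the `expand_query` function are hypothetical names chosen for this example.

```python
from typing import Dict, List, Set

# Hypothetical domain thesaurus: each term maps to editor-validated synonyms.
# The entries are invented for illustration and do not come from the paper.
THESAURUS: Dict[str, Set[str]] = {
    "jam": {"misfeed"},
    "toner": {"ink"},
}


def expand_query(query: str, thesaurus: Dict[str, Set[str]]) -> List[str]:
    """Return the query terms followed by their synonyms, without duplicates."""
    terms = query.lower().split()
    expanded = list(terms)
    seen = set(terms)
    for term in terms:
        # Sort synonyms so the output order is deterministic.
        for synonym in sorted(thesaurus.get(term, ())):
            if synonym not in seen:
                seen.add(synonym)
                expanded.append(synonym)
    return expanded


print(expand_query("paper jam", THESAURUS))
```

Terms without a thesaurus entry pass through unchanged, so a domain-specific thesaurus only widens the query where an editor has confirmed a synonym. This is the mechanism by which expansion can raise recall without adding the noise attributed to general-purpose thesauri.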
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Castellani, S., Kaplan, A., Roulland, F., Willamowski, J., Grasso, A. (2009). Creation and Maintenance of Query Expansion Rules. In: Filipe, J., Cordeiro, J. (eds) Enterprise Information Systems. ICEIS 2009. Lecture Notes in Business Information Processing, vol 24. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01347-8_68
DOI: https://doi.org/10.1007/978-3-642-01347-8_68
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-01346-1
Online ISBN: 978-3-642-01347-8
eBook Packages: Computer Science (R0)