Abstract
This paper reports the development of Dipe-D, a knowledge-based procedure for formulating Boolean queries in information retrieval. Dipe-D creates a query in two steps: (1) the user's information need is developed interactively while its concepts are identified, and (2) the collection of identified concepts is automatically transformed into a Boolean query. In the first step, the user explores the subject area as represented in a knowledge base. The user does this by specifying the information need (that is, the concepts that meet it) in an artificial language and looking through the solution provided by the computer. The specification language allows concepts to be specified by their features, either precisely or vaguely. By repeatedly specifying the information need and exploring the resulting concepts, the user can single out precisely the concepts that describe the information need. In the second step, the program provides the designations (and variants) for the identified concepts and connects them with appropriate operators. Dipe-D is meant to improve on existing procedures that identify the concepts less systematically, create a query manually, and then sometimes expand that query. Experiments are reported on each of the two steps; they indicate that the first step identifies only relevant concepts, though not all of them, and that the second step performs at least as well as human beings do.
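To make the second step concrete, the following is a minimal sketch of how a set of identified concepts might be turned into a Boolean query. It is not Dipe-D's actual algorithm: the abstract says only that designations are connected "by appropriate operators", so the rule used here (OR between the designations of one concept, AND between concepts) is an assumption. The function names `quote` and `build_boolean_query` and the sample concepts are hypothetical.

```python
from typing import Dict, List


def quote(term: str) -> str:
    """Wrap multi-word designations in double quotes so they stay one search term."""
    return f'"{term}"' if " " in term else term


def build_boolean_query(concepts: Dict[str, List[str]]) -> str:
    """Assumed rule, not taken from the paper: OR the designations of each
    concept, then AND the resulting concept groups."""
    groups = []
    for designations in concepts.values():
        if not designations:
            continue  # a concept without designations contributes nothing
        terms = [quote(t) for t in designations]
        group = " OR ".join(terms)
        # Parenthesize only when a group really contains an OR.
        groups.append(f"({group})" if len(terms) > 1 else group)
    return " AND ".join(groups)


# Hypothetical output of step 1: concepts mapped to designations and variants.
concepts = {
    "myocardial infarction": ["myocardial infarction", "heart attack"],
    "aspirin": ["aspirin", "acetylsalicylic acid"],
}
query = build_boolean_query(concepts)
print(query)
# ("myocardial infarction" OR "heart attack") AND (aspirin OR "acetylsalicylic acid")
```

In this sketch, adding a variant to a concept widens only that concept's OR group, while adding a concept narrows the query through an additional AND clause.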
Cite this article
van der Pol, R. Dipe-D: A Tool for Knowledge-Based Query Formulation in Information Retrieval. Information Retrieval 6, 21–47 (2003). https://doi.org/10.1023/A:1022944313947
DOI: https://doi.org/10.1023/A:1022944313947