A few examples go a long way: constructing query models from elaborate query formulations

Authors:
Krisztian Balog, University of Amsterdam, Amsterdam, Netherlands
Wouter Weerkamp, University of Amsterdam, Amsterdam, Netherlands
Maarten de Rijke, University of Amsterdam, Amsterdam, Netherlands

SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, July 2008, Pages 371–378. https://doi.org/10.1145/1390334.1390399

Published: 20 July 2008

ABSTRACT

We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a short query (of a few keywords) together with examples of key reference pages. Given this setup, we investigate how the examples can be utilized to improve the end-to-end performance on the document retrieval task. Our approach is based on a language modeling framework, where the query model is modified to resemble the example pages. We compare several methods for sampling expansion terms from the example pages to support query-dependent and query-independent query expansion; the latter is motivated by the wish to increase "aspect recall", and attempts to uncover aspects of the information need not captured by the query.

For evaluation purposes we use the CSIRO data set created for the TREC 2007 Enterprise track. The best performance is achieved by query models based on query-independent sampling of expansion terms from the example documents.
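
The abstract does not give the estimation formulas, so the following is only a minimal sketch of the general idea: a query model built by interpolating the original keyword query with a term distribution taken from the example pages. The linear interpolation, the averaging of per-document distributions as a stand-in for query-independent sampling, and the names `mle`, `expansion_model`, `query_model`, `lam`, and `k` are all assumptions made for illustration, not the authors' method. Inputs are assumed to be non-empty lists of tokens.

```python
from collections import Counter


def mle(tokens):
    """Maximum-likelihood term distribution: P(t) = count(t) / len(tokens)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def expansion_model(example_docs, k=10):
    """Assumed query-independent sampling: average the per-document term
    distributions of the example pages, keep the top-k terms, renormalize."""
    avg = Counter()
    for doc in example_docs:
        for t, p in mle(doc).items():
            avg[t] += p / len(example_docs)
    top = avg.most_common(k)
    norm = sum(p for _, p in top)
    return {t: p / norm for t, p in top}


def query_model(query, example_docs, lam=0.5, k=10):
    """Assumed interpolation:
    P(t | theta_Q) = lam * P(t | Q) + (1 - lam) * P(t | examples)."""
    p_q = mle(query)
    p_e = expansion_model(example_docs, k)
    terms = set(p_q) | set(p_e)
    return {t: lam * p_q.get(t, 0.0) + (1 - lam) * p_e.get(t, 0.0) for t in terms}


# Hypothetical toy input: a two-keyword query plus two tokenized example pages.
query = ["solar", "energy"]
examples = [
    ["solar", "panel", "efficiency", "panel"],
    ["photovoltaic", "panel", "research"],
]
model = query_model(query, examples, lam=0.5, k=3)
for term, prob in sorted(model.items(), key=lambda kv: -kv[1]):
    print(f"{term}\t{prob:.3f}")
```

In this toy run, terms such as "panel" and "photovoltaic" enter the query model even though they do not occur in the keyword query, which is the kind of effect the abstract attributes to query-independent expansion aimed at "aspect recall". The weight `lam` controls how much probability mass stays on the original query terms.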

Index Terms

Information systems

Published in
SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
July 2008, 934 pages
ISBN: 9781605581644
DOI: 10.1145/1390334
General Chairs: Tat-Seng Chua (National University of Singapore), Mun-Kew Leong (National Library Board, Singapore)
Program Chairs: Syung Hyon Myaeng (Information and Communications University, Korea), Douglas W. Oard (University of Maryland, College Park, USA), Fabrizio Sebastiani (Consiglio Nazionale delle Ricerche, Italy)
Copyright © 2008 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
enterprise search
language models
query expansion
query modeling