Article
DOI: 10.1145/1062745.1062926

A framework for determining necessary query set sizes to evaluate web search effectiveness

Published: 10 May 2005

Abstract

We describe a framework of bootstrapped hypothesis testing for estimating the confidence in one web search engine outperforming another over any randomly sampled query set of a given size. To validate this framework, we have constructed and made available a precision-oriented test collection consisting of manual binary relevance judgments for each of the top ten results of ten web search engines across 896 queries and the single best result for each of those queries. Results from this bootstrapping approach over typical query set sizes indicate that examining repeated statistical tests is imperative, as a single test is quite likely to find significant differences that do not necessarily generalize. We also find that the number of queries needed for a repeatable evaluation in a dynamic environment such as the web is much higher than previously studied.
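The bootstrapped comparison the abstract describes can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: `bootstrap_confidence` and its parameters are hypothetical names, and it assumes per-query effectiveness scores (e.g. precision over the top ten results) for two engines evaluated on the same queries. It estimates, by repeated resampling, how often engine A would beat engine B on a randomly drawn query set of a given size:

```python
import random

def bootstrap_confidence(scores_a, scores_b, set_size, trials=10000, seed=0):
    """Estimate the probability that engine A outperforms engine B
    on a randomly sampled query set of `set_size` queries.

    scores_a / scores_b: per-query effectiveness scores (e.g.
    precision@10) for the same queries, indexed identically.
    """
    rng = random.Random(seed)
    n_queries = len(scores_a)
    wins = 0
    for _ in range(trials):
        # Resample a query set of the target size with replacement.
        sample = [rng.randrange(n_queries) for _ in range(set_size)]
        mean_a = sum(scores_a[i] for i in sample) / set_size
        mean_b = sum(scores_b[i] for i in sample) / set_size
        if mean_a > mean_b:
            wins += 1
    return wins / trials
```

A confidence near 1.0 (or 0.0) that is stable as `set_size` grows would suggest a difference that generalizes across query samples; values hovering near 0.5 would suggest that a single significant test on one query set may not be repeatable, which is the paper's central caution.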


Cited By

  • (2015) Incremental Sampling of Query Logs. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1093-1096. DOI: 10.1145/2766462.2776780. Online publication date: 9 Aug 2015.
  • (2010) Web search solved? Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 529-538. DOI: 10.1145/1871437.1871507. Online publication date: 26 Oct 2010.
  • (2007) Repeatable evaluation of search services in dynamic environments. ACM Transactions on Information Systems, 26(1). DOI: 10.1145/1292591.1292592. Online publication date: 1 Nov 2007.
  • (2006) A picture of search. Proceedings of the 1st International Conference on Scalable Information Systems. DOI: 10.1145/1146847.1146848. Online publication date: 30 May 2006.
  • (2005) Predicting query difficulty on the web by learning visual clues. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 615-616. DOI: 10.1145/1076034.1076155. Online publication date: 15 Aug 2005.
  • (2005) Surrogate scoring for improved metasearch precision. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 583-584. DOI: 10.1145/1076034.1076139. Online publication date: 15 Aug 2005.

Published In

WWW '05: Special Interest Tracks and Posters of the 14th International Conference on World Wide Web
May 2005, 454 pages
ISBN: 1595930515
DOI: 10.1145/1062745

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)

