Article
DOI: 10.1145/1062745.1062926

A framework for determining necessary query set sizes to evaluate web search effectiveness

Published: 10 May 2005

Abstract

We describe a framework of bootstrapped hypothesis testing for estimating the confidence in one web search engine outperforming another over any randomly sampled query set of a given size. To validate this framework, we have constructed and made available a precision-oriented test collection consisting of manual binary relevance judgments for each of the top ten results of ten web search engines across 896 queries and the single best result for each of those queries. Results from this bootstrapping approach over typical query set sizes indicate that examining repeated statistical tests is imperative, as a single test is quite likely to find significant differences that do not necessarily generalize. We also find that the number of queries needed for a repeatable evaluation in a dynamic environment such as the web is much higher than previously studied.
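The bootstrapped comparison the abstract describes can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' code: `bootstrap_confidence` and its parameters are hypothetical names, and it assumes per-query effectiveness scores (e.g. precision over the top ten results) for two engines evaluated on the same queries. It estimates, by repeated resampling, how often engine A would beat engine B on a randomly drawn query set of a given size:

```python
import random

def bootstrap_confidence(scores_a, scores_b, set_size, trials=10000, seed=0):
    """Estimate the probability that engine A outperforms engine B
    on a randomly sampled query set of `set_size` queries.

    scores_a / scores_b: per-query effectiveness scores (e.g.
    precision@10) for the same queries, indexed identically.
    """
    rng = random.Random(seed)
    n_queries = len(scores_a)
    wins = 0
    for _ in range(trials):
        # Resample a query set of the target size with replacement.
        sample = [rng.randrange(n_queries) for _ in range(set_size)]
        mean_a = sum(scores_a[i] for i in sample) / set_size
        mean_b = sum(scores_b[i] for i in sample) / set_size
        if mean_a > mean_b:
            wins += 1
    return wins / trials
```

A confidence near 1.0 (or 0.0) that is stable as `set_size` grows would suggest a difference that generalizes across query samples; values hovering near 0.5 would suggest that a single significant test on one query set may not be repeatable, which is the paper's central caution.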


Cited By

  • (2015) Incremental Sampling of Query Logs. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1093-1096. DOI: 10.1145/2766462.2776780. Online publication date: 9 Aug 2015.
  • (2010) Web search solved? Proceedings of the 19th ACM International Conference on Information and Knowledge Management, pp. 529-538. DOI: 10.1145/1871437.1871507. Online publication date: 26 Oct 2010.
  • (2007) Repeatable evaluation of search services in dynamic environments. ACM Transactions on Information Systems, 26(1). DOI: 10.1145/1292591.1292592. Online publication date: 1 Nov 2007.
  • (2006) A picture of search. Proceedings of the 1st International Conference on Scalable Information Systems. DOI: 10.1145/1146847.1146848. Online publication date: 30 May 2006.
  • (2005) Predicting query difficulty on the web by learning visual clues. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 615-616. DOI: 10.1145/1076034.1076155. Online publication date: 15 Aug 2005.
  • (2005) Surrogate scoring for improved metasearch precision. Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 583-584. DOI: 10.1145/1076034.1076139. Online publication date: 15 Aug 2005.

Published In

WWW '05: Special Interest Tracks and Posters of the 14th International Conference on World Wide Web
May 2005, 454 pages
ISBN: 1595930515
DOI: 10.1145/1062745

Publisher

Association for Computing Machinery, New York, NY, United States

Acceptance Rates

Overall acceptance rate: 1,899 of 8,196 submissions (23%)

