
A new method for automatic performance comparison of search engines


Abstract

In this paper, we present a new method for automatically comparing the performance, such as precision, of search engines. Based on queries randomly selected from a specific domain of interest, the method uses robots to automatically query the target search engines, evaluates the relevance of the returned links to the query either automatically, based on the vector space model, or manually, and then applies statistical measures, including the probability of win and the Friedman statistic, to compare the performance of the search engines. We show the experimental results of the new method on three search engines: AltaVista, Google, and InfoSeek. The method arrived at the same performance comparison result whether the automatic or the manual relevance evaluation was applied. In addition, our results show that the probability of win is a better metric than the Friedman statistic for performance comparison. The advantages of the new method are that it is fast, flexible, and consistent, and that it can adapt to fast-changing search engines.
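The abstract names three computational pieces: vector-space relevance scoring of returned links, a pairwise probability-of-win measure, and the Friedman statistic across engines. The paper's own implementation is not reproduced here; the following is a minimal Python sketch of how those pieces could fit together, assuming TF-IDF weighting with cosine similarity for relevance, ties counted as half a win, and per-query mean relevance as the score being compared. All function names, parameters, and the toy numbers are illustrative assumptions, not the authors' exact definitions or data.

```python
# Minimal sketch (not the authors' code) of the three components described in
# the abstract: vector-space relevance scoring, probability of win, and the
# Friedman statistic. TF-IDF weighting, tie handling, and per-query mean
# relevance are assumptions made for illustration.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from scipy.stats import friedmanchisquare


def relevance_scores(query: str, documents: list[str]) -> np.ndarray:
    """Score each returned document against the query with TF-IDF cosine similarity."""
    vectorizer = TfidfVectorizer(stop_words="english")
    doc_matrix = vectorizer.fit_transform(documents)
    query_vec = vectorizer.transform([query])
    return cosine_similarity(query_vec, doc_matrix).ravel()


def probability_of_win(scores_a: np.ndarray, scores_b: np.ndarray) -> float:
    """Estimate P(engine A beats engine B): fraction of queries on which A's
    score is higher, counting ties as half a win."""
    wins = np.sum(scores_a > scores_b) + 0.5 * np.sum(scores_a == scores_b)
    return wins / len(scores_a)


# Relevance of the links one engine returned for a single query (toy texts).
scores = relevance_scores(
    "automatic performance comparison of search engines",
    ["Text of the first returned page ...", "Text of the second returned page ..."],
)

# Per-query mean relevance for three engines over the same query set (toy numbers).
engine_scores = {
    "AltaVista": np.array([0.42, 0.31, 0.55, 0.47]),
    "Google":    np.array([0.51, 0.38, 0.60, 0.49]),
    "InfoSeek":  np.array([0.40, 0.35, 0.52, 0.44]),
}

print(probability_of_win(engine_scores["Google"], engine_scores["AltaVista"]))

# Friedman test across the engines (blocks = queries, treatments = engines).
stat, p_value = friedmanchisquare(*engine_scores.values())
print(stat, p_value)
```

In this sketch the probability of win is a direct, easily interpreted pairwise comparison, whereas the Friedman test only reports whether the engines differ as a group, which is consistent with the abstract's observation that the probability of win is the more useful metric for ranking engines.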




About this article

Cite this article

Li, L., Shang, Y. A new method for automatic performance comparison of search engines. World Wide Web 3, 241–247 (2000). https://doi.org/10.1023/A:1018790907285
