Abstract
Relevance evaluation is an important topic in Web search engine research. Traditional evaluation methods rely on a large amount of human effort, which makes the process extremely time-consuming in practice. By analyzing large-scale user query logs and click-through data, we propose a performance evaluation method that automatically generates large-scale Web search topics and answer sets under the Cranfield framework. These query-to-answer pairs are used directly in relevance evaluation with several widely adopted precision- and recall-related retrieval performance metrics. Beyond analyzing the logs of a single search engine, we propose user behavior models over the click-through logs of multiple search engines to reduce potential bias toward any one engine. Experimental results show that the evaluation results are similar to those obtained by traditional human annotation, while our method avoids the bias and subjectivity of manual expert judgments in the traditional approach.
Supported by the Chinese National Key Foundation Research & Development Plan (2004CB318108), the Natural Science Foundation (60621062, 60503064, 60736044), and the National 863 High Technology Project (2006AA01Z141).
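The following is a minimal sketch, not the authors' implementation, of the evaluation pipeline the abstract describes: answer sets are derived from aggregated click-through logs and a ranked result list is then scored with precision/recall-style metrics. The log format, the `min_clicks` threshold, and all function names are assumptions introduced here for illustration; the paper's own topic-generation and multi-engine behavior models are more involved.

```python
# Sketch only: derive query-to-answer pairs from click-through logs and score
# a ranked result list with precision@k and average precision. The threshold
# and data layout are illustrative assumptions, not the paper's method.
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

def build_answer_sets(click_log: Iterable[Tuple[str, str]],
                      min_clicks: int = 10) -> Dict[str, Set[str]]:
    """Treat URLs clicked at least `min_clicks` times for a query as its answer set."""
    counts: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for query, url in click_log:                 # one (query, clicked URL) record per click
        counts[query][url] += 1
    return {q: {u for u, c in urls.items() if c >= min_clicks}
            for q, urls in counts.items()}

def precision_at_k(ranked: List[str], answers: Set[str], k: int) -> float:
    """Fraction of the top-k returned URLs that are in the answer set."""
    return sum(1 for u in ranked[:k] if u in answers) / k

def average_precision(ranked: List[str], answers: Set[str]) -> float:
    """Average of precision values at each rank where a relevant URL appears."""
    hits, score = 0, 0.0
    for rank, url in enumerate(ranked, start=1):
        if url in answers:
            hits += 1
            score += hits / rank
    return score / len(answers) if answers else 0.0

# Usage: answer sets come from the logs, ranked lists from the engine under test.
log = [("airs 2009", "http://example.org/a")] * 12 + [("airs 2009", "http://example.org/b")] * 3
answers = build_answer_sets(log)
ranked = ["http://example.org/a", "http://example.org/c"]
print(precision_at_k(ranked, answers["airs 2009"], k=2),
      average_precision(ranked, answers["airs 2009"]))
```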
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cen, R., Liu, Y., Zhang, M., Ru, L., Ma, S. (2009). Automatic Search Engine Performance Evaluation with the Wisdom of Crowds. In: Lee, G.G., et al. Information Retrieval Technology. AIRS 2009. Lecture Notes in Computer Science, vol 5839. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04769-5_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04768-8
Online ISBN: 978-3-642-04769-5