Adapting Document Ranking to Users’ Preferences Using Click-Through Data

  • Conference paper
Information Retrieval Technology (AIRS 2006)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 4182))

Abstract

This paper proposes a new approach to ranking the documents retrieved by a search engine using click-through data. The goal is for the final ranked list of documents to accurately represent the user preferences reflected in the click-through data. Our approach combines the ranking produced by a traditional IR algorithm (BM25) with that produced by a machine learning algorithm (Naïve Bayes). The machine learning algorithm is trained on click-through data (queries and their associated documents), while the IR algorithm runs over the document collection. We consider several alternative strategies for combining the results obtained from click-through data with those obtained from document data. Experimental results show that every method that uses click-through data greatly improves preference ranking over BM25 alone. A linear combination of Naïve Bayes scores and BM25 scores performs best for the task. We also found that the preference ranking methods can preserve relevance ranking, i.e., they can perform as well as BM25 for relevance ranking.
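
As a rough illustration of the linear combination mentioned in the abstract, the sketch below blends two per-document score sets for a single query. It is not the authors' implementation: the min-max normalization, the weight name `alpha`, the function names, and the toy scores are all assumptions made for this example, and the abstract does not say how the scores are scaled or how the weight is chosen.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so the two score sets are comparable (an assumption)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so this source carries no ranking signal.
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def combine_linear(
    bm25_scores: Dict[str, float],
    nb_scores: Dict[str, float],
    alpha: float = 0.5,
) -> List[str]:
    """Rank documents by alpha * NB + (1 - alpha) * BM25 on normalized scores.

    alpha is a hypothetical mixing weight: 0 uses BM25 only, 1 uses the
    click-through (Naive Bayes) scores only.
    """
    bm25_n = min_max_normalize(bm25_scores)
    nb_n = min_max_normalize(nb_scores)
    docs = set(bm25_scores) | set(nb_scores)
    combined = {
        d: alpha * nb_n.get(d, 0.0) + (1.0 - alpha) * bm25_n.get(d, 0.0)
        for d in docs
    }
    # Sort by descending combined score; break ties by document id.
    return sorted(docs, key=lambda d: (-combined[d], d))


# Toy scores for one query (made up for illustration only).
bm25 = {"d1": 12.0, "d2": 9.0, "d3": 3.0}   # content-based scores
nb = {"d1": -8.0, "d2": -2.0, "d3": -5.0}   # log-probabilities from click data

ranking = combine_linear(bm25, nb, alpha=0.5)
print(ranking)
```

With `alpha=0` the ranking reduces to the BM25 order and with `alpha=1` to the Naïve Bayes order; intermediate values trade the two sources of evidence against each other.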

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhao, M., Li, H., Ratnaparkhi, A., Hon, HW., Wang, J. (2006). Adapting Document Ranking to Users’ Preferences Using Click-Through Data. In: Ng, H.T., Leong, MK., Kan, MY., Ji, D. (eds) Information Retrieval Technology. AIRS 2006. Lecture Notes in Computer Science, vol 4182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11880592_3

  • DOI: https://doi.org/10.1007/11880592_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-45780-0

  • Online ISBN: 978-3-540-46237-8

  • eBook Packages: Computer Science, Computer Science (R0)
