DOI: 10.1145/1772690.1772867
poster

Sampling high-quality clicks from noisy click data

Published: 26 April 2010

Abstract

Click data captures many users' document preferences for a query and has been shown to significantly improve search engine ranking. However, most click data is noisy and of low frequency, with queries associated with documents via only one or a few clicks. This severely limits the usefulness of click data as a ranking signal. Given potentially noisy clicks comprising results with at most one click for a query, how do we extract high-quality clicks that may be useful for ranking? In this poster, we introduce a technique based on query entropy for noise reduction in click data. We study the effect of query entropy as well as features such as user engagement and the match between the query and the document. Based on query entropy plus other features, we can sample noisy data down to 15% of its overall size with 43% query recall and an average increase of 20% in precision for recalled queries.
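
The abstract does not say how query entropy is computed or which direction of entropy marks a click as high quality. As a rough illustration only, the sketch below assumes a common definition, the Shannon entropy of the click distribution over documents for each query, and assumes that queries with low entropy (clicks concentrated on few documents) are the ones to keep. The function names, the toy click log, and the threshold are hypothetical and are not taken from the poster.

```python
import math
from collections import Counter, defaultdict


def query_entropy(doc_clicks):
    """Shannon entropy (in bits) of one query's click distribution over documents."""
    total = sum(doc_clicks.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for count in doc_clicks.values():
        if count > 0:
            p = count / total
            entropy += p * math.log2(1 / p)
    return entropy


def sample_low_entropy(click_log, max_entropy):
    """Keep (query, document) clicks whose query entropy is at most max_entropy.

    Assumption: low entropy signals agreement among users, so those clicks are
    treated as higher quality. The poster may use a different rule.
    """
    per_query = defaultdict(Counter)
    for query, doc in click_log:
        per_query[query][doc] += 1
    entropies = {q: query_entropy(c) for q, c in per_query.items()}
    kept = [(q, d) for q, d in click_log if entropies[q] <= max_entropy]
    return kept, entropies


# Hypothetical toy click log: (query, clicked document) pairs.
click_log = [
    ("acm dl", "dl.acm.org"), ("acm dl", "dl.acm.org"),
    ("acm dl", "dl.acm.org"), ("acm dl", "acm.org"),
    ("jaguar", "a"), ("jaguar", "b"), ("jaguar", "c"), ("jaguar", "d"),
]

kept, entropies = sample_low_entropy(click_log, max_entropy=1.0)
for query, h in entropies.items():
    print(f"{query}: {h:.3f} bits")
print(f"kept {len(kept)} of {len(click_log)} clicks")
```

In this toy log, the clicks for "jaguar" are spread evenly over four documents, so its entropy is high and its clicks are dropped, while the clicks for "acm dl" concentrate on one document and are kept. The other features the abstract mentions (user engagement, query-document match) are not modeled here.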

Published In

WWW '10: Proceedings of the 19th international conference on World wide web
April 2010
1407 pages
ISBN: 9781605587998
DOI: 10.1145/1772690

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. click data
  2. noise elimination
  3. query entropy
  4. web search ranking

Qualifiers

  • Poster

Conference

WWW '10
WWW '10: The 19th International World Wide Web Conference
April 26 - 30, 2010
Raleigh, North Carolina, USA

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

