DOI: 10.1145/1935826.1935890
poster

Searchable web sites recommendation

Published: 09 February 2011

Abstract

In this paper, we propose a new framework for searchable web site recommendation. Given a query, our system recommends a list of searchable web sites ranked by relevance, which can be used to complement the web page results and ads from a search engine. We model the conditional probability of a searchable web site being relevant to a given query in terms of three main components: the language model of the query, the language model of the content within the web site, and the reputation of the web site's search capability (static rank). The language models for queries and searchable sites are built using information mined from client-side browsing logs. The static rank for each searchable site leverages features extracted from these client-side logs, such as the number of queries submitted to the site, and features extracted from general search engines, such as the number of web pages indexed for the site, the number of clicks per query, and the dwell time that a user spends on the search result page and on the clicked result pages. We also learn a weight for each kind of feature to optimize ranking performance. In our experiment, we discover 10.5 thousand searchable sites and use 5 million unique queries, extracted from one week of log data, to build our searchable web site recommendation system and demonstrate its effectiveness.
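The abstract names the three components of the model but not the formula that combines them, so the following Python sketch is only one plausible reading, not the authors' method. It assumes a unigram site language model with Dirichlet smoothing, a static rank computed as a weighted sum of features, and a final score that adds the query log-likelihood to the static rank. The site names, feature names, weights, and the smoothing parameter `mu` are all hypothetical; in the paper the weights are learned rather than set by hand.

```python
import math
from collections import Counter


def build_language_model(texts):
    """Unigram term counts from lower-cased, whitespace-tokenised texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def query_log_likelihood(query, site_counts, background_counts, mu=10.0):
    """log P(query | site) under a Dirichlet-smoothed unigram model (assumed choice)."""
    site_total = sum(site_counts.values())
    bg_total = sum(background_counts.values())
    vocab_size = len(background_counts)
    log_p = 0.0
    for term in query.lower().split():
        # Add-one smoothing on the background model keeps unseen terms non-zero.
        p_bg = (background_counts[term] + 1) / (bg_total + vocab_size + 1)
        p_term = (site_counts[term] + mu * p_bg) / (site_total + mu)
        log_p += math.log(p_term)
    return log_p


def static_rank(features, weights):
    """Weighted sum of per-site features (weights are hand-set here, not learned)."""
    return sum(weights[name] * value for name, value in features.items())


def recommend(query, sites, weights, mu=10.0):
    """Return (site, score) pairs sorted from most to least relevant."""
    background = Counter()
    for site in sites.values():
        background.update(site["lm"])
    scored = []
    for name, site in sites.items():
        score = query_log_likelihood(query, site["lm"], background, mu)
        score += static_rank(site["features"], weights)
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy data: queries observed per site stand in for the mined logs.
sites = {
    "recipes.example": {
        "lm": build_language_model(["chicken soup recipe", "vegan pasta recipe"]),
        "features": {"log_query_volume": 2.0, "clicks_per_query": 0.6},
    },
    "flights.example": {
        "lm": build_language_model(["cheap flights paris", "flights to tokyo"]),
        "features": {"log_query_volume": 3.0, "clicks_per_query": 0.4},
    },
}
weights = {"log_query_volume": 0.1, "clicks_per_query": 0.5}

for site_name, score in recommend("pasta recipe", sites, weights):
    print(f"{site_name}\t{score:.3f}")
```

With this toy data both sites have the same static rank, so the ordering is decided entirely by the language-model term. Changing the feature weights shows how the static rank can shift the ranking.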

Published In

WSDM '11: Proceedings of the fourth ACM international conference on Web search and data mining
February 2011
870 pages
ISBN: 9781450304931
DOI: 10.1145/1935826
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. vertical search engines

Qualifiers

  • Poster

Conference

Acceptance Rates

WSDM '11 Paper Acceptance Rate 83 of 372 submissions, 22%;
Overall Acceptance Rate 498 of 2,863 submissions, 17%

Cited By

  • (2016) Learning hostname preference to enhance search relevance. Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, pp. 3903-3909. DOI: 10.5555/3061053.3061165. Online publication date: 9-Jul-2016.
  • (2012) Characterizing web content, user interests, and search behavior by reading level and topic. Proceedings of the fifth ACM international conference on Web search and data mining, pp. 213-222. DOI: 10.1145/2124295.2124323. Online publication date: 8-Feb-2012.
  • (2011) Enhanced information retrieval using domain-specific recommender models. Proceedings of the Third international conference on Advances in information retrieval theory, pp. 201-212. DOI: 10.5555/2040317.2040343. Online publication date: 12-Sep-2011.
  • (2011) Enhanced Information Retrieval Using Domain-Specific Recommender Models. Advances in Information Retrieval Theory, pp. 201-212. DOI: 10.1007/978-3-642-23318-0_19. Online publication date: 2011.