DOI: 10.1145/2009916.2009928
research-article

Learning search tasks in queries and web pages via graph regularization

Published: 24 July 2011

Abstract

As the Internet grows, search engines play an increasingly important role in helping users access online information effectively. It has recently been recognized that a query is often triggered by a search task the user wants to accomplish. Similarly, many web pages are designed specifically to help accomplish a certain task. Learning the hidden tasks behind queries and web pages can therefore help search engines return the most useful pages to users through task matching. For instance, the search task that triggers the query "thinkpad T410 broken" is to maintain a computer, so it is desirable for a search engine to return the Lenovo troubleshooting page at the top of the list. However, existing search engine technologies focus mainly on topic detection or relevance ranking, and cannot predict the task that triggers a query or the task that a web page can accomplish.
In this paper, we propose to simultaneously classify queries and web pages into popular search tasks by exploiting their content together with click-through logs. Specifically, we construct a task-oriented heterogeneous graph among queries and web pages, in which a pair of objects is linked whenever the two potentially share similar search tasks. We design a novel graph-based regularization algorithm that leverages this graph for search task prediction. Extensive experiments on real search log data demonstrate the effectiveness of our method over state-of-the-art classifiers, and show that search performance can be significantly improved by using the task prediction results as additional information.
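
The abstract does not spell out the regularization algorithm itself, so the following is only a rough illustration of the general idea of spreading task labels over a query/page graph. It uses generic label propagation with a symmetrically normalized adjacency matrix, which is a stand-in technique and not the method proposed in the paper. The function name `propagate_tasks`, the node identifiers, the task labels "maintain" and "purchase", the click counts, and the parameters `alpha` and `iterations` are all hypothetical.

```python
from collections import defaultdict
from math import sqrt


def propagate_tasks(edges, seeds, tasks, alpha=0.8, iterations=50):
    """Generic label propagation over a query/page graph (illustrative only).

    edges: iterable of (node_a, node_b, weight), e.g. click counts
    seeds: dict mapping a few labeled nodes to their task label
    tasks: list of all task labels
    """
    # Build a symmetric weighted adjacency map.
    adj = defaultdict(dict)
    for a, b, w in edges:
        adj[a][b] = adj[a].get(b, 0.0) + w
        adj[b][a] = adj[b].get(a, 0.0) + w

    degree = {n: sum(nbrs.values()) for n, nbrs in adj.items()}
    nodes = list(adj)
    num_tasks = len(tasks)

    # Initial label scores: one-hot rows for seeds, zeros elsewhere.
    y = {n: [1.0 if seeds.get(n) == t else 0.0 for t in tasks] for n in nodes}
    f = {n: row[:] for n, row in y.items()}

    for _ in range(iterations):
        new_f = {}
        for n in nodes:
            spread = [0.0] * num_tasks
            for m, w in adj[n].items():
                # Symmetric normalization: w / sqrt(deg(n) * deg(m)).
                s = w / sqrt(degree[n] * degree[m])
                for k in range(num_tasks):
                    spread[k] += s * f[m][k]
            # Mix neighbor scores with the original seed labels.
            new_f[n] = [alpha * spread[k] + (1 - alpha) * y[n][k]
                        for k in range(num_tasks)]
        f = new_f

    # Assign each node the task with the highest score.
    return {n: tasks[max(range(num_tasks), key=lambda k: f[n][k])]
            for n in nodes}


# Hypothetical click-through edges: (query, page, click count).
edges = [
    ("q:thinkpad T410 broken", "p:lenovo-troubleshooting", 5),
    ("q:thinkpad won't boot", "p:lenovo-troubleshooting", 3),
    ("q:buy thinkpad T410", "p:laptop-store", 4),
    ("q:thinkpad T410 price", "p:laptop-store", 2),
]
seeds = {
    "q:thinkpad T410 broken": "maintain",
    "q:buy thinkpad T410": "purchase",
}
labels = propagate_tasks(edges, seeds, ["maintain", "purchase"])
```

In this toy graph, unlabeled queries and pages inherit a task from the labeled queries they share clicks with. A task-oriented graph as described in the abstract would presumably also use content-based links, which this sketch leaves out.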

      Published In

      SIGIR '11: Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
      July 2011
      1374 pages
      ISBN:9781450307574
      DOI:10.1145/2009916

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. classification
      2. graph regularization
      3. web search task

      Qualifiers

      • Research-article

      Conference

      SIGIR '11

      Acceptance Rates

      Overall Acceptance Rate 792 of 3,983 submissions, 20%

      Cited By

      • (2019) Fast and accurate stream processing by filtering the cold. The VLDB Journal. DOI: 10.1007/s00778-019-00560-1. Online publication date: 13-Aug-2019
      • (2018) Cold Filter. Proceedings of the 2018 International Conference on Management of Data (741-756). DOI: 10.1145/3183713.3183726. Online publication date: 27-May-2018
      • (2017) Utilizing Verbal Intent in Semantic Contextual Advertising. IEEE Intelligent Systems 32:3 (7-13). DOI: 10.1109/MIS.2017.45. Online publication date: May-2017
      • (2017) Faceted exploration of RDF/S datasets. Journal of Intelligent Information Systems 48:2 (329-364). DOI: 10.1007/s10844-016-0413-8. Online publication date: 1-Apr-2017
      • (2017) An Active Learning Approach to Recognizing Domain-Specific Queries From Query Log. Web and Big Data (18-32). DOI: 10.1007/978-3-319-63564-4_2. Online publication date: 3-Aug-2017
      • (2016) Category Oriented Task Extraction. Proceedings of the 2016 ACM on Conference on Human Information Interaction and Retrieval (333-336). DOI: 10.1145/2854946.2854997. Online publication date: 13-Mar-2016
      • (2015) Trend Query Classification using Label Propagation. Transactions of the Japanese Society for Artificial Intelligence 30:1 (161-171). DOI: 10.1527/tjsai.30.161. Online publication date: 2015
      • (2015) Constructing Complex Search Tasks with Coherent Subtask Search Goals. ACM Transactions on Asian and Low-Resource Language Information Processing 15:2 (1-29). DOI: 10.1145/2742547. Online publication date: 11-Dec-2015
      • (2015) Fine-Grained Knowledge Sharing in Collaborative Environments. IEEE Transactions on Knowledge and Data Engineering 27:8 (2163-2174). DOI: 10.1109/TKDE.2015.2411283. Online publication date: 1-Aug-2015
      • (2015) Ranking on heterogeneous manifolds for tag recommendation in social tagging services. Neurocomputing 148 (521-534). DOI: 10.1016/j.neucom.2014.07.011. Online publication date: Jan-2015
