DOI: 10.1145/2556195.2556222
research-article

Heterogeneous graph-based intent learning with queries, web pages and Wikipedia concepts

Published: 24 February 2014

Abstract

The problem of learning user search intents has attracted intensive attention from both industry and academia. However, state-of-the-art intent learning algorithms suffer from different drawbacks when they use only a single type of data source. For example, query text alone has difficulty distinguishing ambiguous queries, and search logs are biased by the order of search results and by users' noisy click behaviors. This work is the first to leverage three types of objects collaboratively, namely queries, web pages and Wikipedia concepts, for learning generic search intents, and it constructs a heterogeneous graph to represent the multiple types of relationships between them. A novel unsupervised method called heterogeneous graph-based soft-clustering is developed to derive an intent indicator for each object based on the constructed heterogeneous graph. With the proposed co-clustering method, one can enhance the quality of intent understanding by taking advantage of different types of data, which complement each other, and make the implicit intents easier to interpret with explicit knowledge from Wikipedia concepts. Experiments on two real-world datasets demonstrate the power of the proposed method: compared to the best state-of-the-art method, it achieves a 9.25% improvement in NDCG on a search ranking task and a 4.67% improvement in Rand index on an object co-clustering task.
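
The two evaluation measures named in the abstract, Rand index and NDCG, can be illustrated with a minimal Python sketch. This is not the authors' evaluation code: the function names are hypothetical, the Rand index is computed on hard cluster labels (a soft intent indicator would first have to be reduced to one label per object, for example by taking its largest component, which is an assumption here), and the NDCG uses the common (2^rel - 1) / log2(rank + 1) gain and discount, which may differ from the formulation used in the paper.

```python
import math
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two hard clusterings agree."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have equal length")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    # A pair agrees when both clusterings put it together or both split it.
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def dcg(relevances):
    """Discounted cumulative gain of a ranked list of graded relevances."""
    return sum(
        (2 ** rel - 1) / math.log2(rank + 2)  # rank is 0-based
        for rank, rel in enumerate(relevances)
    )


def ndcg(relevances, k=None):
    """DCG of the ranking divided by the DCG of the ideal ordering."""
    k = len(relevances) if k is None else k
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0
```

Both functions return values in [0, 1], with 1 meaning perfect agreement with the reference clustering or the ideal ranking. A relative gain such as the ones reported above would be computed by comparing these scores between two methods on the same data.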

Published In

WSDM '14: Proceedings of the 7th ACM international conference on Web search and data mining
February 2014
712 pages
ISBN:9781450323512
DOI:10.1145/2556195

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. heterogeneous graph clustering
  2. search intent
  3. wikipedia

Qualifiers

  • Research-article

Conference

WSDM 2014

Acceptance Rates

WSDM '14 Paper Acceptance Rate 64 of 355 submissions, 18%;
Overall Acceptance Rate 498 of 2,863 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 4
  • Downloads (Last 6 weeks): 0
Reflects downloads up to 02 Mar 2025

Cited By

  • (2024) LLM-enhanced Cascaded Multi-level Learning on Temporal Heterogeneous Graphs. Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 512-521. DOI: 10.1145/3626772.3657731. Online publication date: 10-Jul-2024
  • (2022) IHGNN: Interactive Hypergraph Neural Network for Personalized Product Search. Proceedings of the ACM Web Conference 2022, pp. 256-265. DOI: 10.1145/3485447.3511954. Online publication date: 25-Apr-2022
  • (2021) Social media intention mining for sustainable information systems: categories, taxonomy, datasets and challenges. Complex & Intelligent Systems, 9:3, pp. 2773-2799. DOI: 10.1007/s40747-021-00342-9. Online publication date: 5-Apr-2021
  • (2020) Graph Theory: A Comprehensive Survey about Graph Theory Applications in Computer Science and Social Networks. Inventions, 5:1, article 10. DOI: 10.3390/inventions5010010. Online publication date: 20-Feb-2020
  • (2020) Heterogeneous-Graph-Based Video Search Reranking Using Topic Relevance. IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences, E103.A:12, pp. 1529-1540. DOI: 10.1587/transfun.2020SMP0023. Online publication date: 1-Dec-2020
  • (2020) Structural Relationship Representation Learning with Graph Embedding for Personalized Product Search. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, pp. 915-924. DOI: 10.1145/3340531.3411936. Online publication date: 19-Oct-2020
  • (2020) Coverage-based query subtopic diversification leveraging semantic relevance. Knowledge and Information Systems, 62:7, pp. 2873-2891. DOI: 10.1007/s10115-020-01470-3. Online publication date: 27-Apr-2020
  • (2019) Multi-Objective GP Strategies for Topical Search Integrating Wikipedia Concepts. Proceedings of the ACM Symposium on Document Engineering 2019, pp. 1-10. DOI: 10.1145/3342558.3345402. Online publication date: 23-Sep-2019
  • (2019) Neural IR Meets Graph Embedding: A Ranking Model for Product Search. The World Wide Web Conference, pp. 2390-2400. DOI: 10.1145/3308558.3313468. Online publication date: 13-May-2019
  • (2019) Deep Neural Architecture with Character Embedding for Semantic Frame Detection. 2019 IEEE 13th International Conference on Semantic Computing (ICSC), pp. 302-307. DOI: 10.1109/ICOSC.2019.8665582. Online publication date: Jan-2019