research-article

Sparse hidden-dynamics conditional random fields for user intent understanding

Authors:
Yelong Shen, Beihang University, Beijing, China
Jun Yan, Microsoft Research Asia, Beijing, China
Shuicheng Yan, National University of Singapore, Singapore
Lei Ji, Microsoft Research Asia, Beijing, China
Ning Liu, Microsoft Research Asia, Beijing, China
Zheng Chen, Microsoft Research Asia, Beijing, China

WWW '11: Proceedings of the 20th international conference on World wide web, March 2011, Pages 7–16
https://doi.org/10.1145/1963405.1963411

Published: 28 March 2011

ABSTRACT

Understanding user intent from sequential search behavior, i.e., predicting the intent of each query in a user's search session, is crucial for modern Web search engines. However, because the number of user behavior variables is huge and the intent labels defined by human editors are coarse, it is very difficult to model user behavioral dynamics or user intent dynamics in search sessions directly. In this paper, we propose a novel Sparse Hidden-Dynamic Conditional Random Fields (SHDCRF) model for learning user intent from search sessions. By incorporating hidden state variables, SHDCRF aims to learn a substructure, i.e., a set of related hidden variables, for each intent label; these substructures model the intermediate dynamics between user intent labels and user behavioral variables. In addition, SHDCRF learns a sparse relation between the hidden variables and the intent labels, which makes the hidden state variables explainable. Extensive experimental results on real user search sessions from a popular commercial search engine show that the proposed SHDCRF model significantly outperforms classical solutions such as Support Vector Machine (SVM), Conditional Random Field (CRF), and Latent-Dynamic Conditional Random Field (LDCRF) in terms of intent prediction.
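The structure described in the abstract can be pictured with a toy sketch: a chain of hidden states sits between per-query behavior features and intent labels, and a sparse matrix links hidden states to labels. This is not the authors' SHDCRF training or inference procedure. The abstract does not say how sparsity is obtained, so the L1-style soft-thresholding step here is an assumption, as are the function names (`forward_backward`, `soft_threshold`, `predict_intents`), the matrix shapes, and the random parameters.

```python
import numpy as np

def forward_backward(unary, trans):
    """Posterior marginals over hidden states for a linear chain.

    unary: (T, H) log-potentials of each hidden state at each query position.
    trans: (H, H) log-potentials for moving from hidden state i to state j.
    Returns a (T, H) array whose rows each sum to 1.
    """
    T, H = unary.shape
    alpha = np.zeros((T, H))
    beta = np.zeros((T, H))
    alpha[0] = unary[0]
    for t in range(1, T):
        # log-sum-exp over the previous hidden state i
        alpha[t] = unary[t] + np.logaddexp.reduce(alpha[t - 1][:, None] + trans, axis=0)
    for t in range(T - 2, -1, -1):
        # log-sum-exp over the next hidden state j
        beta[t] = np.logaddexp.reduce(trans + (unary[t + 1] + beta[t + 1])[None, :], axis=1)
    log_post = alpha + beta
    log_post -= np.logaddexp.reduce(log_post, axis=1, keepdims=True)
    return np.exp(log_post)

def soft_threshold(w, lam):
    """Shrink weights toward zero; entries with |w| <= lam become exactly zero."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def predict_intents(unary, trans, relation, lam):
    """Score intent labels per query through a sparsified hidden-to-label relation.

    relation: (H, L) dense weights linking hidden states to intent labels
    (hypothetical; in a real model these would be learned).
    """
    post = forward_backward(unary, trans)      # (T, H)
    sparse_rel = soft_threshold(relation, lam) # (H, L), many exact zeros
    scores = post @ sparse_rel                 # (T, L)
    return post, sparse_rel, scores.argmax(axis=1)

# Toy session: 5 queries, 4 hidden states, 3 intent labels, random parameters.
rng = np.random.default_rng(0)
post, sparse_rel, labels = predict_intents(
    rng.normal(size=(5, 4)), rng.normal(size=(4, 4)), rng.normal(size=(4, 3)), lam=0.5
)
print("nonzero hidden-to-label links:", np.count_nonzero(sparse_rel), "of", sparse_rel.size)
print("predicted intent label per query:", labels)
```

Once weak links are zeroed, each hidden state connects to only a few intent labels, which is one way to read the abstract's claim that sparsity makes the hidden states explainable.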


Index Terms

Information systems → Information retrieval


Published in
WWW '11: Proceedings of the 20th international conference on World wide web
March 2011
840 pages
ISBN:9781450306324
DOI:10.1145/1963405
General Chairs:
S. Sadagopan, IIIT-Bangalore, India
Krithi Ramamritham, IIT-Bombay, India
Arun Kumar, IBM Research, India
M. P. Ravindra, Infosys E & R, India
Program Chairs:
Elisa Bertino, Purdue University, USA
Ravi Kumar, Yahoo! Research, USA
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
conditional random field
hidden variable
sparse hidden-dynamic
user intent
user search session
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 1,899 of 8,196 submissions, 23%