ABSTRACT
Understanding user intent from sequential search behavior, i.e., predicting the intent of each query in a search session, is crucial for modern Web search engines. However, because the number of user behavior variables is huge and the intent labels defined by human editors are coarse, it is very difficult to model user behavioral dynamics or user intent dynamics in search sessions directly. In this paper, we propose a novel Sparse Hidden-Dynamic Conditional Random Fields (SHDCRF) model for learning user intent from search sessions. By incorporating hidden state variables, SHDCRF learns a substructure, i.e., a set of related hidden variables, for each intent label; these substructures model the intermediate dynamics between user intent labels and user behavioral variables. In addition, SHDCRF learns a sparse relation between the hidden variables and the intent labels, which makes the hidden state variables explainable. Extensive experiments on real user search sessions from a popular commercial search engine show that the proposed SHDCRF model significantly outperforms classical solutions such as Support Vector Machine (SVM), Conditional Random Field (CRF) and Latent-Dynamic Conditional Random Field (LDCRF) in intent prediction.
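For context, the LDCRF baseline named in the abstract is usually written as below. This is a sketch of the standard LDCRF formulation, not of the SHDCRF objective, which the abstract does not state. The symbols are notation assumed here: $x$ is the sequence of behavioral observations in a session, $y$ the sequence of intent labels, $h$ the sequence of hidden states, $\mathcal{H}_{y}$ the set of hidden states tied to label $y$, and $F_k$, $\theta_k$ the feature functions and their weights.

```latex
% LDCRF: each label y owns a disjoint set of hidden states H_y,
% and the label posterior marginalizes over compatible hidden sequences.
\begin{align}
P(y \mid x, \theta)
  &= \sum_{h \,:\, \forall j,\; h_j \in \mathcal{H}_{y_j}} P(h \mid x, \theta), \\
P(h \mid x, \theta)
  &= \frac{1}{Z(x, \theta)} \exp\Big( \sum_{k} \theta_k \, F_k(h, x) \Big), \\
Z(x, \theta)
  &= \sum_{h'} \exp\Big( \sum_{k} \theta_k \, F_k(h', x) \Big).
\end{align}
```

In LDCRF the assignment of hidden states to labels is fixed in advance. According to the abstract, SHDCRF instead learns a sparse relation between hidden variables and intent labels.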
Index Terms
- Sparse hidden-dynamics conditional random fields for user intent understanding