ABSTRACT
Query reformulation has been studied as a domain-independent task. Existing work expands a query, or substitutes its terms, using the same set of candidates regardless of the query's domain. Because terms may be semantically related in one domain but not in others, it is more effective to provide candidates that are specific to the query's domain. This paper demonstrates the advantage of such a domain-dependent approach to query reformulation, which uses a standard technique to learn candidates for each domain from a separate sample of data derived automatically from a generic query log. Our results show that this approach statistically significantly outperforms the domain-independent approach, which learns to reformulate from the same log using the same technique, on a large query set consisting of both health and commerce queries. These results have a very practical interpretation: building separate reformulation systems for queries from different domains requires no additional manual effort, yet it provides substantially better retrieval effectiveness than a single system that handles all queries. Additionally, we show that leveraging domain-specific, manually labelled data leads to further improvement.
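As an illustration of the idea in the abstract, the following Python sketch contrasts per-domain candidate tables with a single pooled table. It is a toy under stated assumptions, not the paper's method: the abstract does not name the "standard technique", so simple substitution counting over reformulation pairs stands in for it, and the domain labels, log entries, and function names (`learn_candidates`, `best_candidate`) are all hypothetical.

```python
from collections import Counter, defaultdict


def learn_candidates(reformulation_pairs):
    """Count term substitutions seen in (query, rewritten query) pairs.

    Assumption: only pairs of equal length that differ in exactly one
    position are used, which keeps the toy term alignment trivial.
    """
    table = defaultdict(Counter)
    for before, after in reformulation_pairs:
        old_terms, new_terms = before.split(), after.split()
        if len(old_terms) != len(new_terms):
            continue
        diffs = [(x, y) for x, y in zip(old_terms, new_terms) if x != y]
        if len(diffs) == 1:
            source, target = diffs[0]
            table[source][target] += 1
    return table


def best_candidate(table, term):
    """Return the most frequent substitute for term, or None if unseen."""
    if term not in table:
        return None
    return table[term].most_common(1)[0][0]


# Hypothetical log sample, already labelled by domain.
log = [
    ("health", "cold remedy", "flu remedy"),
    ("health", "cold symptoms", "flu symptoms"),
    ("commerce", "cold drinks", "iced drinks"),
    ("commerce", "cold storage", "iced storage"),
    ("commerce", "cold packs", "iced packs"),
]

# Domain-dependent: one candidate table per domain.
pairs_by_domain = defaultdict(list)
for domain, before, after in log:
    pairs_by_domain[domain].append((before, after))
domain_tables = {d: learn_candidates(p) for d, p in pairs_by_domain.items()}

# Domain-independent baseline: one table learned from the whole log.
pooled_table = learn_candidates([(before, after) for _, before, after in log])

print(best_candidate(domain_tables["health"], "cold"))
print(best_candidate(pooled_table, "cold"))
```

On this toy data, the health table suggests "flu" for "cold", whereas the pooled table suggests "iced" because commerce pairs outnumber health pairs, which is the kind of cross-domain interference the abstract argues against.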
Index Terms
- Domain dependent query reformulation for web search