DOI: 10.1145/2396761.2398401
research-article

Domain dependent query reformulation for web search

Published: 29 October 2012

ABSTRACT

Query reformulation has been studied as a domain independent task. Existing work attempts to expand a query or substitute its terms with the same set of candidates regardless of the query's domain. Since terms might be semantically related in one domain but not in others, it is more effective to provide candidates for queries with respect to their domain. This paper demonstrates the advantage of this domain dependent query reformulation approach, which uses a standard technique to learn its candidates for each domain from a separate sample of data derived automatically from a generic query log. Our results show that our approach statistically significantly outperforms the domain independent approach, which learns to reformulate from the same log using the same technique, on a large query set consisting of both health and commerce queries. Our results have a very practical interpretation: building different reformulation systems to handle queries from different domains requires no additional manual effort, yet it provides substantially better retrieval effectiveness than having a single system handle all queries. Additionally, we show that leveraging domain specific manually labelled data leads to further improvement.
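
The contrast between the two approaches can be illustrated with a toy sketch. This is not the paper's method: plain frequency counting stands in for the unspecified "standard technique", and the log triples, domain labels, and function names (`learn_candidates`, `top_candidate`) are all invented for illustration. It only shows why one shared candidate table can pick a substitution that suits one domain and not the other.

```python
from collections import Counter, defaultdict

# Hypothetical toy log of (domain, original term, replacement term) triples.
# The abstract says per-domain samples are derived automatically from a
# generic query log; these triples and their domain labels are invented.
TOY_LOG = [
    ("health", "cold", "flu"),
    ("health", "cold", "flu"),
    ("health", "cold", "flu"),
    ("health", "cold", "cough"),
    ("commerce", "cold", "refrigerated"),
    ("commerce", "cold", "refrigerated"),
    ("commerce", "cold", "cooler"),
]


def learn_candidates(log, by_domain=True):
    """Count replacements per (domain, term); pool all domains under "*" if by_domain is False."""
    tables = defaultdict(Counter)
    for domain, term, replacement in log:
        key = (domain, term) if by_domain else ("*", term)
        tables[key][replacement] += 1
    return tables


def top_candidate(tables, domain, term):
    """Return the most frequent replacement for a term, or None if the term is unseen."""
    counts = tables.get((domain, term))
    if not counts:
        return None
    return counts.most_common(1)[0][0]


# Domain dependent: one candidate table per domain.
domain_tables = learn_candidates(TOY_LOG, by_domain=True)
# Domain independent: a single pooled table for all queries.
generic_tables = learn_candidates(TOY_LOG, by_domain=False)

print(top_candidate(domain_tables, "health", "cold"))
print(top_candidate(domain_tables, "commerce", "cold"))
print(top_candidate(generic_tables, "*", "cold"))
```

In this toy data the pooled table offers the health-oriented candidate for every query containing "cold", including commerce queries, whereas the per-domain tables keep the two senses apart.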

    • Published in

      CIKM '12: Proceedings of the 21st ACM international conference on Information and knowledge management
      October 2012
      2840 pages
      ISBN:9781450311564
      DOI:10.1145/2396761

      Copyright © 2012 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 1,861 of 8,427 submissions, 22%
