DOI: 10.1145/1390334.1390399
research-article

A few examples go a long way: constructing query models from elaborate query formulations

Published: 20 July 2008

ABSTRACT

We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a short query (of a few keywords) together with examples of key reference pages. Given this setup, we investigate how the examples can be utilized to improve the end-to-end performance on the document retrieval task. Our approach is based on a language modeling framework, where the query model is modified to resemble the example pages. We compare several methods for sampling expansion terms from the example pages to support query-dependent and query-independent query expansion; the latter is motivated by the wish to increase "aspect recall", and attempts to uncover aspects of the information need not captured by the query.

For evaluation purposes we use the CSIRO data set created for the TREC 2007 Enterprise track. The best performance is achieved by query models based on query-independent sampling of expansion terms from the example documents.
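
The idea of modifying the query model to resemble the example pages can be illustrated with a minimal sketch. The abstract does not give the estimation formulas, so everything below is an assumption made for illustration: the function names, the interpolation weight `lam`, the cutoff `k`, and the use of pooled term frequency as a stand-in for query-independent sampling of expansion terms. It is not the authors' method.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-alphanumeric characters (illustrative tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def mle_model(tokens):
    """Maximum-likelihood term distribution over a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}


def expansion_model(example_docs, k=10, stopwords=frozenset()):
    """Assumed query-independent selection: pool term counts over all
    example documents, keep the k most frequent terms, renormalise."""
    pooled = Counter()
    for doc in example_docs:
        pooled.update(t for t in tokenize(doc) if t not in stopwords)
    top = pooled.most_common(k)
    total = sum(c for _, c in top)
    return {t: c / total for t, c in top} if total else {}


def build_query_model(query, example_docs, lam=0.5, k=10, stopwords=frozenset()):
    """Linearly interpolate the keyword query model with the expansion model.
    lam is a hypothetical weight on the original query."""
    q_model = mle_model(tokenize(query))
    e_model = expansion_model(example_docs, k=k, stopwords=stopwords)
    if not e_model:  # no usable example terms: fall back to the plain query
        return q_model
    terms = set(q_model) | set(e_model)
    return {
        t: lam * q_model.get(t, 0.0) + (1 - lam) * e_model.get(t, 0.0)
        for t in terms
    }
```

Called as `build_query_model("solar energy", ["solar panels convert sunlight", "photovoltaic solar cells and sunlight"])`, the sketch returns a distribution that still weights the original keywords but also assigns probability to terms such as "sunlight" that occur only in the example pages, which is the intuition behind uncovering aspects not captured by the query.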

        • Published in

          SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
          July 2008
          934 pages
          ISBN:9781605581644
          DOI:10.1145/1390334

          Copyright © 2008 ACM

          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Acceptance Rates

          Overall Acceptance Rate: 792 of 3,983 submissions, 20%
