Research article
DOI: 10.1145/2063576.2063586

Improving retrieval accuracy of difficult queries through generalizing negative document language models

Published: 24 October 2011

Abstract

When a query topic is difficult and the search results are poor, negative feedback is a useful method for improving retrieval accuracy and user experience. One challenge in negative feedback is that negative documents tend to be distracting in different ways; as training examples, negative examples are therefore sparse. In this paper, we address this data-sparseness problem within the language modeling framework. We propose an optimization framework in which we learn from a few top-ranked non-relevant examples and search a large space of language models to build a more general negative language model. This general negative language model is more powerful in pruning non-relevant documents and thus can potentially improve performance for difficult queries. Experimental results on representative TREC collections show that the proposed optimization framework improves negative feedback performance over the state-of-the-art negative feedback method by generalizing negative language models.
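To make the setup concrete, the following is a minimal sketch, not the paper's method, of how a negative document language model can be used to penalize documents when re-ranking results for a difficult query in the KL-divergence language modeling framework. The function names (unigram_lm, kl, score_with_negative_feedback), the penalty weight beta, and the maximum-likelihood estimation of the negative model from a few top-ranked non-relevant documents are all illustrative assumptions; the paper's contribution, searching a larger space of language models to build a generalized negative model, is not reproduced here.

```python
# Minimal sketch, assuming a KL-divergence retrieval framework with a
# negative-feedback penalty. All names and the weight `beta` are illustrative.
import math
from collections import Counter

def unigram_lm(texts, vocab, eps=1e-9):
    """Maximum-likelihood unigram model over a fixed vocabulary.
    A small floor `eps` keeps every probability positive (sketch only;
    probabilities sum to slightly more than 1)."""
    counts = Counter(w for t in texts for w in t.split() if w in vocab)
    total = sum(counts.values()) or 1
    return {w: counts[w] / total + eps for w in vocab}

def kl(p, q):
    """KL divergence KL(p || q) over a shared vocabulary."""
    return sum(pw * math.log(pw / q[w]) for w, pw in p.items() if pw > 0)

def score_with_negative_feedback(query_lm, doc_lm, neg_lm, beta=0.5):
    """Score(Q, D) = -KL(Q || D) + beta * KL(N || D):
    reward closeness to the query model, penalize closeness to the negative model."""
    return -kl(query_lm, doc_lm) + beta * kl(neg_lm, doc_lm)

# Example: the negative model is estimated from a few top-ranked documents
# judged (or assumed) non-relevant for a difficult, ambiguous query.
vocab = {"jaguar", "car", "animal", "speed", "habitat"}
query_lm = unigram_lm(["jaguar animal habitat"], vocab)
neg_lm = unigram_lm(["jaguar car speed", "jaguar car"], vocab)  # distracting sense
doc_lm = unigram_lm(["jaguar car speed speed"], vocab)
print(score_with_negative_feedback(query_lm, doc_lm, neg_lm))
```

In this sketch, a document close to the negative model receives a smaller KL(N || D) bonus and therefore a lower score. The paper's framework instead searches a space of language models to obtain a more general negative model than the simple maximum-likelihood estimate used above, so that it prunes a wider range of non-relevant documents.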




    Published In

    CIKM '11: Proceedings of the 20th ACM International Conference on Information and Knowledge Management
    October 2011, 2712 pages
    ISBN: 9781450307178
    DOI: 10.1145/2063576

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. difficult topics
    2. generalizing language model
    3. language models
    4. negative feedback
    5. optimization

    Qualifiers

    • Research-article

    Conference

    CIKM '11

    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%



    Cited By

    • (2022) Learning Relevant Questions for Conversational Product Search using Deep Reinforcement Learning. Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, pp. 746-754. https://doi.org/10.1145/3488560.3498526. Online publication date: 11-Feb-2022.
    • (2021) Asking Clarifying Questions Based on Negative Feedback in Conversational Search. Proceedings of the 2021 ACM SIGIR International Conference on Theory of Information Retrieval, pp. 157-166. https://doi.org/10.1145/3471158.3472232. Online publication date: 11-Jul-2021.
    • (2019) Conversational Product Search Based on Negative Feedback. Proceedings of the 28th ACM International Conference on Information and Knowledge Management, pp. 359-368. https://doi.org/10.1145/3357384.3357939. Online publication date: 3-Nov-2019.
    • (2019) Improving Content-based Audio Retrieval by Vocal Imitation Feedback. ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 4100-4104. https://doi.org/10.1109/ICASSP.2019.8683461. Online publication date: May-2019.
    • (2018) Interactive Spoken Content Retrieval by Deep Reinforcement Learning. IEEE/ACM Transactions on Audio, Speech and Language Processing, 26(12), pp. 2447-2459. https://doi.org/10.1109/TASLP.2018.2852739. Online publication date: 1-Dec-2018.
    • (2017) Negative Relevance Feedback for Exploratory Search with Visual Interactive Intent Modeling. Proceedings of the 22nd International Conference on Intelligent User Interfaces, pp. 149-159. https://doi.org/10.1145/3025171.3025222. Online publication date: 7-Mar-2017.
    • (2017) A Distribution Separation Method Using Irrelevance Feedback Data for Information Retrieval. ACM Transactions on Intelligent Systems and Technology, 8(3), pp. 1-26. https://doi.org/10.1145/2994608. Online publication date: 12-Jan-2017.
    • (2016) Utilizing Focused Relevance Feedback. Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 1061-1064. https://doi.org/10.1145/2911451.2914695. Online publication date: 7-Jul-2016.
    • (2016) Sub-event discovery and retrieval during natural hazards on social media data. World Wide Web, 19(2), pp. 277-297. https://doi.org/10.1007/s11280-015-0359-8. Online publication date: 1-Mar-2016.
    • (2015) adaQAC. Proceedings of the 38th International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 143-152. https://doi.org/10.1145/2766462.2767697. Online publication date: 9-Aug-2015.
    • Show More Cited By
