research-article

Positional relevance model for pseudo-relevance feedback

Authors:

ChengXiang Zhai

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

Pages 579 - 586

https://doi.org/10.1145/1835449.1835546

Published: 19 July 2010

Abstract

Pseudo-relevance feedback is an effective technique for improving retrieval results. Traditional feedback algorithms use a whole feedback document as the unit from which to extract words for query expansion, which is not optimal because a document may cover several different topics and thus contain much irrelevant information. In this paper, we study how to effectively select from feedback documents those words that are focused on the query topic, based on the positions of terms in the feedback documents. We propose a positional relevance model (PRM) to address this problem in a unified probabilistic way. The proposed PRM extends the relevance model to exploit term positions and proximity, assigning more weight to words closer to query words, based on the intuition that such words are more likely to be related to the query topic. We develop two methods to estimate PRM based on different sampling processes. Experimental results on two large retrieval datasets show that the proposed PRM is effective and robust for pseudo-relevance feedback, significantly outperforming the relevance model in both document-based feedback and passage-based feedback.
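The proximity intuition in the abstract can be illustrated with a small sketch. This is not the PRM estimation procedure from the paper: the abstract does not specify a kernel or the two sampling processes, so the Gaussian kernel, the `sigma` parameter, and the function name `proximity_weights` are all assumptions made for illustration. The sketch only shows how words near query-term positions in a single feedback document could receive more weight than words far from them.

```python
import math
from collections import defaultdict


def proximity_weights(tokens, query_terms, sigma=2.0):
    """Weight each non-query word by its closeness to query-term positions.

    Assumption: a Gaussian kernel over token distance. The abstract does not
    name a kernel; this choice is purely illustrative.
    """
    query_terms = set(query_terms)
    # Positions in the document where any query term occurs.
    query_positions = [i for i, tok in enumerate(tokens) if tok in query_terms]

    scores = defaultdict(float)
    for i, tok in enumerate(tokens):
        if tok in query_terms:
            continue  # only candidate expansion words are scored
        for j in query_positions:
            # Contribution decays smoothly with distance |i - j|.
            scores[tok] += math.exp(-((i - j) ** 2) / (2 * sigma ** 2))

    total = sum(scores.values())
    if total == 0:
        return {}  # no query term occurs in the document
    # Normalize so the weights form a distribution over candidate words.
    return {word: score / total for word, score in scores.items()}


# Hypothetical feedback document that drifts off topic near the end.
doc = "jaguar speed in the wild habitat unrelated advert about cheap phones".split()
weights = proximity_weights(doc, ["jaguar"], sigma=2.0)
top = sorted(weights, key=weights.get, reverse=True)[:3]
```

In this toy document, words adjacent to the query term dominate the weights, while the off-topic words at the end receive almost none. A whole-document approach would treat every occurrence equally regardless of position.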

Index Terms

Positional relevance model for pseudo-relevance feedback
1. Information systems
  1. Information retrieval

Published In

SIGIR '10: Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval

July 2010

944 pages

ISBN: 9781450301534

DOI: 10.1145/1835449

General Chairs: Fabio Crestani (University of Lugano, CH), Stéphane Marchand-Maillet (University of Geneva, CH)

Program Chairs: Hsin-Hsi Chen (National Taiwan University, TW), Efthimis N. Efthimiadis (University of Washington, USA), Jacques Savoy (University of Neuchatel, CH)

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGIR: ACM Special Interest Group on Information Retrieval

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

SIGIR '10

Sponsor:

SIGIR

SIGIR '10: The 33rd International ACM SIGIR conference on research and development in Information Retrieval

July 19 - 23, 2010

Geneva, Switzerland

Acceptance Rates

SIGIR '10 Paper Acceptance Rate 87 of 520 submissions, 17%;

Overall Acceptance Rate 792 of 3,983 submissions, 20%
