DOI: 10.1145/3357384.3358087
Short paper

Cluster-Based Focused Retrieval

Published: 3 November 2019

ABSTRACT

The focused retrieval task is to rank passages of documents by their presumed relevance to a query. Inspired by work on cluster-based document retrieval, we present a novel cluster-based focused retrieval method. The method ranks clusters of similar passages using a learning-to-rank approach and transforms the cluster ranking into a passage ranking. Empirical evaluation demonstrates the clear merits of the method.
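To make the pipeline described in the abstract concrete, the following is a minimal sketch of a cluster-then-rank flow. It is not the paper's method: the clustering (token-overlap nearest neighbours) and the cluster scorer (mean passage retrieval score, standing in for the learned ranker) are simplifying assumptions made for illustration only.

```python
# Illustrative cluster-based passage-ranking sketch (assumptions throughout):
#  1. cluster passages by similarity,
#  2. score each cluster (here: mean passage score, a stand-in for a
#     learning-to-rank model),
#  3. transform the cluster ranking into a passage ranking.

def similarity(p1, p2):
    """Token-overlap (Jaccard) similarity between two passages (assumption)."""
    t1, t2 = set(p1.split()), set(p2.split())
    return len(t1 & t2) / max(1, len(t1 | t2))

def build_clusters(passages, k=2):
    """Each passage seeds a cluster of its k-1 most similar passages."""
    clusters = []
    for i, p in enumerate(passages):
        neighbours = sorted(
            (j for j in range(len(passages)) if j != i),
            key=lambda j: similarity(p, passages[j]),
            reverse=True,
        )[:k - 1]
        clusters.append([i] + neighbours)
    return clusters

def rank_passages(passages, passage_scores, k=2):
    """Rank clusters, then let each passage inherit the best score among
    clusters containing it; the original passage score breaks ties."""
    clusters = build_clusters(passages, k)
    cluster_scores = [sum(passage_scores[i] for i in c) / len(c)
                      for c in clusters]
    best = {}
    for c, s in zip(clusters, cluster_scores):
        for i in c:
            best[i] = max(best.get(i, float("-inf")), s)
    return sorted(range(len(passages)),
                  key=lambda i: (best[i], passage_scores[i]),
                  reverse=True)
```

The key design point the sketch illustrates is the cluster-to-passage transformation: a passage can be promoted above higher-scoring passages when it belongs to a strongly scored cluster, which is the premise of cluster-based retrieval.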


Published in: CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, November 2019, 3373 pages. ISBN: 9781450369763. DOI: 10.1145/3357384. Copyright © 2019 ACM.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance rates: CIKM '19 paper acceptance rate: 202 of 1,031 submissions (20%). Overall acceptance rate: 1,861 of 8,427 submissions (22%).
