DOI: 10.1145/3357384.3358087
Short paper

Cluster-Based Focused Retrieval

Published: 3 November 2019

ABSTRACT

The focused retrieval task is to rank passages of documents by their presumed relevance to a query. Inspired by work on cluster-based document retrieval, we present a novel cluster-based focused retrieval method. The method ranks clusters of similar passages using a learning-to-rank approach and transforms the cluster ranking into a passage ranking. Empirical evaluation demonstrates the clear merits of the method.
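To make the pipeline described in the abstract concrete, the following is a minimal sketch of a cluster-then-rank flow. It is not the paper's method: the clustering (token-overlap nearest neighbours) and the cluster scorer (mean passage retrieval score, standing in for the learned ranker) are simplifying assumptions made for illustration only.

```python
# Illustrative cluster-based passage-ranking sketch (assumptions throughout):
#  1. cluster passages by similarity,
#  2. score each cluster (here: mean passage score, a stand-in for a
#     learning-to-rank model),
#  3. transform the cluster ranking into a passage ranking.

def similarity(p1, p2):
    """Token-overlap (Jaccard) similarity between two passages (assumption)."""
    t1, t2 = set(p1.split()), set(p2.split())
    return len(t1 & t2) / max(1, len(t1 | t2))

def build_clusters(passages, k=2):
    """Each passage seeds a cluster of its k-1 most similar passages."""
    clusters = []
    for i, p in enumerate(passages):
        neighbours = sorted(
            (j for j in range(len(passages)) if j != i),
            key=lambda j: similarity(p, passages[j]),
            reverse=True,
        )[:k - 1]
        clusters.append([i] + neighbours)
    return clusters

def rank_passages(passages, passage_scores, k=2):
    """Rank clusters, then let each passage inherit the best score among
    clusters containing it; the original passage score breaks ties."""
    clusters = build_clusters(passages, k)
    cluster_scores = [sum(passage_scores[i] for i in c) / len(c)
                      for c in clusters]
    best = {}
    for c, s in zip(clusters, cluster_scores):
        for i in c:
            best[i] = max(best.get(i, float("-inf")), s)
    return sorted(range(len(passages)),
                  key=lambda i: (best[i], passage_scores[i]),
                  reverse=True)
```

The key design point the sketch illustrates is the cluster-to-passage transformation: a passage can be promoted above higher-scoring passages when it belongs to a strongly scored cluster, which is the premise of cluster-based retrieval.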


Published in: CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management, November 2019, 3373 pages. ISBN: 9781450369763. DOI: 10.1145/3357384. Copyright © 2019 ACM.

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance rates: CIKM '19 paper acceptance rate: 202 of 1,031 submissions (20%). Overall acceptance rate: 1,861 of 8,427 submissions (22%).
