DOI: 10.1145/3404835.3462922

TILDE: Term Independent Likelihood moDEl for Passage Re-ranking

Published: 11 July 2021

ABSTRACT

Deep language models (deep LMs) are increasingly being used for full-text retrieval or within cascade retrieval pipelines as later-stage re-rankers. A problem with using deep LMs is that a slow inference step needs to be performed at query time -- this hinders the practical adoption of these powerful retrieval models, or limits how many documents can sensibly be considered for re-ranking.

We propose the novel, BERT-based Term Independent Likelihood moDEl (TILDE), which ranks documents by both query and document likelihood. At query time, our model does not require the inference step of retrieval approaches based on deep language models, thus providing consistent time savings: the likelihoods of query terms can be pre-computed and stored during index creation. This is achieved by relaxing the term dependence assumption made by deep LMs. In addition, we have devised a novel bi-directional training loss that allows TILDE to maximise both query and document likelihood at the same time during training. At query time, TILDE can rely solely on its query likelihood component (TILDE-QL), or on the combination of TILDE-QL and its document likelihood component (TILDE-DL), thus providing a flexible trade-off between efficiency and effectiveness. Exploiting both components provides the highest effectiveness at a higher computational cost, while relying only on TILDE-QL trades off some effectiveness for faster response times, since no inference is required.
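To make the mechanism concrete, the following is a minimal sketch of TILDE-QL-style scoring -- not the authors' released implementation. It assumes an off-the-shelf BERT masked-LM head supplies a single vocabulary distribution per passage (taken here from the [CLS] position); the function names and these modelling choices are illustrative.

import torch
from transformers import AutoTokenizer, BertForMaskedLM

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

def index_passage(passage):
    # Index-time (offline) step: run BERT once over the passage and keep a
    # single log-probability distribution over the whole vocabulary.
    inputs = tokenizer(passage, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        logits = model(**inputs).logits             # (1, seq_len, vocab_size)
    return torch.log_softmax(logits[0, 0], dim=-1)  # distribution at [CLS]

def score(query, doc_logprobs):
    # Query-time step: under the term independence assumption, the query
    # log-likelihood is just a sum of pre-computed entries -- a table
    # lookup, with no neural inference at query time.
    query_ids = tokenizer(query, add_special_tokens=False)["input_ids"]
    return doc_logprobs[query_ids].sum().item()

doc_logprobs = index_passage("TILDE pre-computes token likelihoods at index time.")
print(score("token likelihood", doc_logprobs))

Under this scheme, a TILDE-DL-style document likelihood component would additionally run one inference pass over the (short) query and sum the log-probabilities of the document's tokens, which is where the extra computational cost of combining both components would come from.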

TILDE is evaluated on the MS MARCO and the TREC Deep Learning 2019 and 2020 passage ranking datasets. Empirical results show that, compared to other approaches that aim to make deep language models operationally viable, TILDE achieves competitive effectiveness coupled with low query latency.

Supplemental Material

TILDE.mp4 (mp4, 34.3 MB)


Published in

SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2021, 2998 pages
ISBN: 9781450380379
DOI: 10.1145/3404835

        Copyright © 2021 ACM

        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Qualifiers

        • research-article

        Acceptance Rates

Overall Acceptance Rate: 792 of 3,983 submissions, 20%
