DOI: 10.1145/3340531.3412071
short-paper

TABLE: A Task-Adaptive BERT-based ListwisE Ranking Model for Document Retrieval

Published: 19 October 2020

Abstract

Document retrieval (DR) is a crucial task in NLP. Recently, pre-trained BERT-like language models have achieved remarkable success, obtaining state-of-the-art results in DR. In this paper, we propose a new BERT-based ranking model for the DR task, named TABLE. In the pre-training stage of TABLE, we present a domain-adaptive strategy. More importantly, in the fine-tuning stage, we develop a two-phase task-adaptive process, i.e., type-adaptive pointwise fine-tuning and listwise fine-tuning. In the type-adaptive pointwise fine-tuning phase, the model learns different matching patterns for different query types. In the listwise fine-tuning phase, the model matches documents against a given query in a listwise fashion. This task-adaptive process makes the model more robust. In addition, a simple but effective exact matching feature is introduced in fine-tuning, which captures matches of out-of-vocabulary (OOV) words between a query and a document. To the best of our knowledge, we are the first to propose a listwise ranking model with BERT. This approach explores rich matching features between queries and documents and therefore substantially improves model performance in DR. Notably, our TABLE model shows excellent performance on the MS MARCO leaderboard.
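
The two fine-tuning ingredients named in the abstract, a listwise objective and an exact matching feature, can be illustrated with a minimal sketch. The abstract does not state which listwise loss or feature encoding TABLE uses, so everything below is an assumption: the loss is a ListNet-style top-one cross entropy (one common listwise choice), the exact-match feature is a per-token 0/1 flag, and the function names exact_match_features, softmax, and listwise_loss are hypothetical.

```python
import math


def exact_match_features(query_tokens, doc_tokens):
    """Assumed encoding: one 0/1 flag per document token,
    1 if the same surface form (case-insensitive) occurs in the query."""
    query_vocab = {t.lower() for t in query_tokens}
    return [1 if t.lower() in query_vocab else 0 for t in doc_tokens]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(pred_scores, relevance_labels):
    """ListNet-style top-one loss (an assumption, not confirmed as TABLE's loss):
    cross entropy between the softmax of the labels and the softmax of the
    predicted scores for all candidate documents of one query."""
    target = softmax([float(r) for r in relevance_labels])
    predicted = softmax(pred_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


if __name__ == "__main__":
    # A rare word that a subword vocabulary might split still matches exactly.
    flags = exact_match_features(["what", "is", "xylophagia"],
                                 ["Xylophagia", "is", "rare"])
    print(flags)

    labels = [1, 0, 0]                # only the first candidate is relevant
    good = listwise_loss([2.0, 0.0, 0.0], labels)
    bad = listwise_loss([0.0, 2.0, 0.0], labels)
    print(good < bad)                 # ranking the relevant document first costs less
```

In this sketch the loss is computed over the whole candidate list of a query at once, which is what distinguishes it from a pointwise loss that scores each query-document pair in isolation. The exact-match flags work on surface tokens, so they remain informative even when a word is missing from the model's vocabulary.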

Supplementary Material

MP4 File (3340531.3412071.mp4)
Presentation Video.




    Information & Contributors

    Information

    Published In

    CIKM '20: Proceedings of the 29th ACM International Conference on Information & Knowledge Management
    October 2020
    3619 pages
    ISBN:9781450368599
    DOI:10.1145/3340531
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. bert
    2. document retrieval
    3. neural information retrieval

    Qualifiers

    • Short-paper

    Conference

    CIKM '20
    Acceptance Rates

    Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 16
    • Downloads (Last 6 weeks): 2
    Reflects downloads up to 07 Mar 2025

    Citations

    Cited By

    • (2024) Gar++: Natural Language to SQL Translation with Efficient Generate-and-Rank. Web and Big Data. DOI: 10.1007/978-981-97-7238-4_26 (pp. 411-427). Online publication date: 28-Aug-2024
    • (2024) Applications. Unsupervised Domain Adaptation. DOI: 10.1007/978-981-97-1025-6_8 (pp. 213-218). Online publication date: 16-Feb-2024
    • (2024) An Adaptive Feature Selection Method for Learning-to-Enumerate Problem. Advances in Information Retrieval. DOI: 10.1007/978-3-031-56063-7_8 (pp. 122-136). Online publication date: 23-Mar-2024
    • (2023) Gar: A Generate-and-Rank Approach for Natural Language to SQL Translation. 2023 IEEE 39th International Conference on Data Engineering (ICDE). DOI: 10.1109/ICDE55515.2023.00016 (pp. 110-122). Online publication date: Apr-2023
    • (2023) Multi-task Learning Based Keywords Weighted Siamese Model for Semantic Retrieval. Advances in Knowledge Discovery and Data Mining. DOI: 10.1007/978-3-031-33380-4_7 (pp. 86-98). Online publication date: 27-May-2023
    • (2022) Multitask Fine-Tuning for Passage Re-Ranking Using BM25 and Pseudo Relevance Feedback. IEEE Access, Vol. 10. DOI: 10.1109/ACCESS.2022.3176894 (pp. 54254-54262). Online publication date: 2022
    • (2021) Self-supervised Fine-tuning for Efficient Passage Re-ranking. Proceedings of the 30th ACM International Conference on Information & Knowledge Management. DOI: 10.1145/3459637.3482179 (pp. 3142-3146). Online publication date: 26-Oct-2021
