DOI: 10.1145/3477495.3531664

Arm: Efficient Learning of Neural Retrieval Models with Desired Accuracy by Automatic Knowledge Amalgamation

Published: 07 July 2022

Abstract

In recent years, there has been increasing interest in adopting published neural retrieval models, learned from large corpora, for text retrieval. Although these models achieve excellent retrieval performance, in terms of popular accuracy metrics, on the datasets on which they were trained, their performance on new text data may degrade. To obtain the desired retrieval performance on both the original training data and the latest data collected after training, the simple approach of learning a new model from both datasets is not always feasible, since the annotated dataset used in training is often not published along with the learned model. Knowledge amalgamation (KA) is an emerging technique for dealing with this inaccessibility of previous training data. KA learns a new model (called a student model) from new data by reusing, or amalgamating, a number of trained models (called teacher models) instead of accessing the teachers' original training data. However, to learn an accurate student model efficiently, the classical KA approach requires manual selection of an appropriate subset of teacher models for amalgamation. This manual selection prevents classical KA from scaling to retrieval tasks for which a large number of candidate teacher models are ready to be reused.
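To make the KA setting concrete, here is a minimal sketch in PyTorch, assuming toy linear scorers in place of real neural retrieval models: frozen teachers score new, unlabeled data, and the student is trained to match their amalgamated scores, so the teachers' original training data is never needed. The averaging of teacher scores and the MSE objective are illustrative assumptions, not Arm's actual amalgamation loss.

```python
# Minimal knowledge-amalgamation sketch (illustrative, not Arm's implementation).
# Assumption: teachers and student are toy linear scorers over query-document
# feature vectors; real teachers would be published neural retrieval models.
import torch
import torch.nn as nn
import torch.nn.functional as F

DIM = 32  # toy feature dimension for a query-document pair

# Teachers: trained elsewhere; their annotated training data is unavailable.
teachers = [nn.Linear(DIM, 1) for _ in range(3)]
for t in teachers:
    t.eval()  # teachers stay frozen; only their outputs are reused

student = nn.Linear(DIM, 1)
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

# "New" unlabeled data collected after the teachers were trained.
new_data = torch.randn(256, DIM)

for step in range(100):
    with torch.no_grad():
        # Amalgamate: average the teachers' relevance scores as the target.
        target = torch.stack([t(new_data) for t in teachers]).mean(dim=0)
    loss = F.mse_loss(student(new_data), target)  # regress onto amalgamated scores
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The point of the sketch is the data flow: only teacher outputs on new data enter the student's loss, which is what makes KA applicable when the original annotated corpora are unpublished.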
This paper presents Arm, an intelligent system for efficiently learning a neural retrieval model with the desired accuracy on incoming data by automatically amalgamating a subset of teacher models (called a teacher model combination, or simply a combination) drawn from a large pool of teacher models. To filter out combinations that fail to produce accurate student models, Arm employs Bayesian optimization to derive an accuracy prediction model from sampled amalgamation tasks. Arm then uses the derived prediction model to exclude unqualified combinations without having to train them.
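The filtering step can be sketched as follows, under stated assumptions: combinations are encoded as binary indicator vectors over the teacher pool, a Gaussian process serves as the surrogate (a common choice in Bayesian optimization), and the hypothetical `measure_accuracy` stands in for the expensive step of actually training and evaluating a student on a sampled combination. Arm's real prediction model, features, and threshold may differ.

```python
# Accuracy-filtering sketch: fit a surrogate on a few sampled amalgamation
# tasks, then prune combinations predicted to miss the desired accuracy
# without ever training them. Illustrative only.
import itertools
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

N_TEACHERS = 6
THRESHOLD = 0.35  # hypothetical desired student accuracy

def measure_accuracy(combo: np.ndarray) -> float:
    # Hypothetical stand-in: in reality, train a student on this teacher
    # combination and evaluate it; here a synthetic function replaces that.
    return float(combo.sum()) / N_TEACHERS * 0.5

# All non-empty teacher combinations as 0/1 indicator vectors.
combos = np.array([c for c in itertools.product([0, 1], repeat=N_TEACHERS) if any(c)])

# Run a handful of sampled amalgamation tasks to get (combination, accuracy) pairs.
rng = np.random.default_rng(0)
sampled = combos[rng.choice(len(combos), size=8, replace=False)]
accs = np.array([measure_accuracy(c) for c in sampled])

# Fit the surrogate and predict accuracy for every combination.
gp = GaussianProcessRegressor().fit(sampled, accs)
pred = gp.predict(combos)

# Exclude unqualified combinations without training them.
qualified = combos[pred >= THRESHOLD]
print(f"{len(qualified)} of {len(combos)} combinations pass the accuracy filter")
```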
To speed up training, Arm introduces a cost model that picks, from all qualified teacher model combinations, the one with the minimal training cost to produce the final student model. This paper demonstrates the major workflow of Arm and presents the produced student models to users.
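Continuing the sketch, the cost-model step can be illustrated with a toy rule in which training cost grows with the total size of the amalgamated teachers; both the sizes and the qualified set below are made-up placeholders, since the abstract does not detail Arm's cost features.

```python
# Cost-model sketch: among qualified combinations, pick the cheapest to train.
# Sizes and the qualified set are hypothetical placeholders.
import numpy as np

teacher_sizes = np.array([110, 66, 340, 110])  # hypothetical parameter counts (millions)

# Indicator vectors that passed the accuracy filter (placeholder values).
qualified = np.array([[1, 1, 0, 0], [1, 0, 1, 0], [0, 1, 1, 1]])

def training_cost(combo: np.ndarray) -> float:
    # Toy estimate: amalgamation cost scales with total teacher size,
    # e.g. one forward pass through every selected teacher per batch.
    return float(teacher_sizes[combo.astype(bool)].sum())

best = min(qualified, key=training_cost)  # cheapest qualified combination
print("train the final student from teachers:", np.flatnonzero(best).tolist())
```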


Cited By

Why and How We Combine Multiple Deep Learning Models With Functional Overlaps. Journal of Software: Evolution and Process 37(2). https://doi.org/10.1002/smr.70003. Online publication date: 16 February 2025.


    Published In

    SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
    July 2022
    3569 pages
ISBN: 9781450387323
DOI: 10.1145/3477495

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Author Tags

    1. cost model
    2. knowledge amalgamation
    3. neural retrieval model
    4. prediction model

    Qualifiers

    • Short-paper

    Funding Sources

    • Natural Science Foundation of China
    • Key Research and Development Program of Zhejiang Province of China

    Conference

    SIGIR '22

    Acceptance Rates

    Overall Acceptance Rate 792 of 3,983 submissions, 20%
