DOI: 10.1145/3477495.3531898
Short paper · Public Access

Inconsistent Ranking Assumptions in Medical Search and Their Downstream Consequences

Published: 07 July 2022

Abstract

Given a query, neural retrieval models predict point estimates of relevance for each document; however, a significant drawback of relying solely on point estimates is that they carry no indication of the model's confidence in its predictions. Despite this lack of information, downstream methods such as reranking, cutoff prediction, and none-of-the-above classification are still able to learn effective functions for their respective tasks. Unfortunately, these downstream methods can suffer poor performance when the initial ranking model loses confidence in its score predictions. This becomes increasingly important in high-stakes settings, such as medical search, which can influence health decision-making.
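The contrast between a point estimate and a score distribution can be made concrete with a small simulation. The sketch below is illustrative only, not the paper's model: it mimics Monte Carlo dropout, where each document is scored under several stochastic forward passes so that the spread of the samples exposes the confidence a single point estimate hides. The noisy-scorer setup, function name, and all parameter values are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def mc_dropout_scores(point_scores, n_samples=500, noise=0.1):
    """Simulate n_samples stochastic forward passes over a set of documents.

    A real Bayesian neural ranker would rescore each query-document pair
    with dropout left active; here the stochasticity is stand-in Gaussian
    noise around each document's point-estimate score.
    """
    point_scores = np.asarray(point_scores, dtype=float)
    return point_scores + rng.normal(0.0, noise, size=(n_samples, len(point_scores)))

# Three documents with point-estimate relevance scores 0.9, 0.6, 0.2.
samples = mc_dropout_scores([0.9, 0.6, 0.2])
mean = samples.mean(axis=0)  # the point estimate, recovered as the sample mean
std = samples.std(axis=0)    # per-document uncertainty, absent from a point estimate
```

A downstream method such as cutoff prediction could then condition on `std` as well as `mean`, rather than on the point estimate alone.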
Recent work has addressed this lack of information by introducing Bayesian uncertainty to capture the possible distribution of a document score. This paper presents the use of this uncertainty information as an indicator of how well downstream methods will function over a ranked list. We highlight a significant bias against certain disease-related queries within the posterior distribution of a neural model, and show that this bias in a model's predictive distribution propagates to downstream methods. Finally, we introduce a multi-distribution uncertainty metric, confidence decay, as a valid way of partially identifying these failure cases in an offline setting without the need for any user feedback.
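The abstract does not spell out the definition of confidence decay, so the sketch below is only a hedged illustration of the general idea: measuring how the model's confidence that each document outranks its successor erodes down a ranked list, using sampled score distributions. The pairwise formulation, function names, and synthetic data are all assumptions for this example, not the paper's metric.

```python
import numpy as np

def pairwise_swap_confidence(samples):
    """samples: (n_samples, n_docs) score draws, columns in ranked order.

    Returns, for each adjacent pair of documents, the fraction of draws in
    which the higher-ranked document actually scores higher, i.e. an
    estimate of P(score_i > score_{i+1}).
    """
    return (samples[:, :-1] > samples[:, 1:]).mean(axis=0)

def confidence_decay(samples):
    """Drop in adjacent-pair confidence relative to the top pair: larger
    values mean the ranking becomes less trustworthy further down the list."""
    conf = pairwise_swap_confidence(samples)
    return float(conf[0] - conf.mean())

# Synthetic ranked list: the head is cleanly separated, the tail overlaps.
rng = np.random.default_rng(1)
n = 2000
samples = np.column_stack([
    rng.normal(3.0, 0.1, n),  # rank 1
    rng.normal(2.0, 0.1, n),  # rank 2
    rng.normal(1.0, 0.5, n),  # rank 3
    rng.normal(0.9, 0.5, n),  # rank 4: barely distinguishable from rank 3
])
conf = pairwise_swap_confidence(samples)
decay = confidence_decay(samples)
```

In an offline setting, a high decay value would flag a query whose downstream reranking or cutoff prediction is likely to misbehave, without requiring any user feedback.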



Published In

SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2022
3569 pages
ISBN:9781450387323
DOI:10.1145/3477495

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. bayesian
  2. bias
  3. fairness
  4. information retrieval
  5. medical search
  6. uncertainty

Qualifiers

  • Short-paper

Conference

SIGIR '22

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Total Citations: 0
  • Total Downloads: 219
  • Downloads (last 12 months): 90
  • Downloads (last 6 weeks): 12

Reflects downloads up to 28 Feb 2025
