DOI: 10.1145/3539618.3594243

Uncertainty Quantification for Text Classification

Published: 18 July 2023

Abstract

This full-day tutorial introduces modern techniques for practical uncertainty quantification, specifically in the context of multi-class and multi-label text classification. First, we explain why estimating aleatoric uncertainty and epistemic uncertainty is useful for text classification models. Then, we describe several state-of-the-art approaches to uncertainty quantification and analyze their scalability to big text data: Virtual Ensembles in GBDT, Bayesian Deep Learning (including Deep Ensembles, Monte-Carlo Dropout, Bayes by Backprop, and their generalization, Epistemic Neural Networks), Evidential Deep Learning (including Prior Networks and Posterior Networks), and Distance Awareness (including Spectral-normalized Neural Gaussian Processes and Deep Deterministic Uncertainty). Next, we cover the latest advances in uncertainty quantification for pre-trained language models, including asking language models to express their uncertainty, interpreting the uncertainties of text classifiers built on large-scale language models, uncertainty estimation in text generation, calibration of language models, and calibration for in-context learning. After that, we discuss typical application scenarios of uncertainty quantification in text classification, including in-domain calibration, cross-domain robustness, and novel class detection. Finally, we list popular performance metrics for evaluating the effectiveness of uncertainty quantification in text classification. Practical hands-on examples and exercises are provided so that attendees can experiment with different uncertainty quantification methods on real-world text classification datasets such as CLINC150.
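To make the aleatoric/epistemic distinction concrete, the following is a minimal NumPy sketch (not taken from the tutorial materials; the function name and toy probabilities are illustrative) of the entropy-based decomposition used by ensemble-style methods such as Deep Ensembles and Monte-Carlo Dropout: the total predictive entropy of the ensemble-averaged prediction splits into the expected entropy of the individual members (aleatoric) plus a mutual-information term (epistemic), which is large exactly when the members disagree confidently.

```python
import numpy as np

def ensemble_uncertainty(member_probs):
    """Split predictive uncertainty for one input into aleatoric and
    epistemic parts, given class probabilities from M ensemble members.

    member_probs: array-like of shape (M, K) over K classes.
    Returns (total, aleatoric, epistemic), all in nats.
    """
    p = np.asarray(member_probs, dtype=float)
    eps = 1e-12                                    # guard against log(0)
    mean_p = p.mean(axis=0)                        # ensemble-averaged prediction
    total = -np.sum(mean_p * np.log(mean_p + eps))             # entropy of the mean
    aleatoric = np.mean(-np.sum(p * np.log(p + eps), axis=1))  # mean member entropy
    epistemic = total - aleatoric                  # mutual-information term
    return total, aleatoric, epistemic

# Members agree: all remaining uncertainty is aleatoric, epistemic ~ 0.
_, _, ep_agree = ensemble_uncertainty([[0.9, 0.1]] * 3)

# Members disagree confidently (the signature of, e.g., an out-of-scope
# input): the epistemic term dominates.
_, _, ep_disagree = ensemble_uncertainty([[0.99, 0.01], [0.01, 0.99]])

assert ep_disagree > ep_agree
```

The same decomposition applies to Monte-Carlo Dropout by treating each stochastic forward pass as one "member"; only the way the M probability vectors are produced differs.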


Cited By

  • (2024) Towards Robust Information Extraction via Binomial Distribution Guided Counterpart Sequence. In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 83-94. https://doi.org/10.1145/3637528.3672067. Online publication date: 25 August 2024.
  • (2024) TAE: Topic-Aware Encoder for Large-Scale Multi-Label Text Classification. Applied Intelligence, Vol. 54, 8, 6269-6284. https://doi.org/10.1007/s10489-024-05485-z. Online publication date: 9 May 2024.
  • (2024) Logarithm of Maximum Posterior Evidence: Advanced Model Selection for Text Classification. In Knowledge Science, Engineering and Management, 229-240. https://doi.org/10.1007/978-981-97-5495-3_17. Online publication date: 16 August 2024.
  • (2023) A Theoretical Analysis of Out-of-Distribution Detection in Multi-Label Classification. In Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval, 275-282. https://doi.org/10.1145/3578337.3605116. Online publication date: 9 August 2023.

Published In

SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2023, 3567 pages
ISBN: 9781450394086
DOI: 10.1145/3539618


Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

  1. language models
  2. text classification
  3. uncertainty quantification

Qualifiers

  • Tutorial


Conference

SIGIR '23

Acceptance Rates

Overall Acceptance Rate 792 of 3,983 submissions, 20%
