Reliably Calibrated Isotonic Regression

Nyberg, Otto; Klami, Arto

doi:10.1007/978-3-030-75762-5_46

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12712)

Included in the following conference series: Pacific-Asia Conference on Knowledge Discovery and Data Mining

Abstract

Using classifiers for decision making requires well-calibrated probabilities for estimating expected utility. Furthermore, knowledge of the reliability of those probabilities is needed to quantify uncertainty. The outputs of most classifiers can be calibrated, typically with isotonic regression, which bins classifier outputs together to form empirical probability estimates. However, especially for highly imbalanced problems, it produces bins with few samples, resulting in probability estimates with very large uncertainty. We provide a formal method for quantifying the reliability of calibration and extend isotonic regression to provide reliable calibration with guarantees on the width of the credible intervals of the probability estimates. We demonstrate the method by calibrating purchase probabilities in e-commerce and achieve a significant reduction in uncertainty without compromising accuracy.
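
The binning and the uncertainty issue described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (see the repository listed under Notes for that). It assumes binary labels, distinct scores, a uniform Beta(1, 1) prior on each bin's probability, and an equal-tailed 95% credible interval estimated by sampling. The function names pav_bins and credible_interval_width are hypothetical and chosen only for this illustration.

```python
import random


def pav_bins(scores, labels):
    """Pool-adjacent-violators on binary labels.

    Returns bins as [positives, count, min_score, max_score] whose
    empirical positive rates increase with the score.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    bins = []
    for i in order:
        bins.append([labels[i], 1, scores[i], scores[i]])
        # Merge while the previous bin's rate is not below the last bin's rate.
        # Rates are compared by cross-multiplication to avoid division.
        while len(bins) > 1 and bins[-2][0] * bins[-1][1] >= bins[-1][0] * bins[-2][1]:
            k, n, _, hi = bins.pop()
            bins[-1][0] += k
            bins[-1][1] += n
            bins[-1][3] = hi
    return bins


def credible_interval_width(k, n, mass=0.95, draws=20000, rng=None):
    """Width of an equal-tailed credible interval for a bin with k positives
    out of n samples, under an assumed uniform Beta(1, 1) prior.

    The posterior is Beta(k + 1, n - k + 1); its quantiles are estimated
    by Monte Carlo sampling with a fixed seed for repeatability.
    """
    rng = rng or random.Random(0)
    samples = sorted(rng.betavariate(k + 1, n - k + 1) for _ in range(draws))
    lower = samples[int((1 - mass) / 2 * draws)]
    upper = samples[int((1 + mass) / 2 * draws) - 1]
    return upper - lower


# Toy data: classifier scores and the observed binary outcomes.
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
labels = [0, 0, 1, 0, 1, 1]
bins = pav_bins(scores, labels)

for k, n, lo, hi in bins:
    width = credible_interval_width(k, n)
    print(f"scores [{lo}, {hi}]: estimate {k / n:.2f}, interval width {width:.2f}")
```

With only two samples per bin, the intervals in this toy example are very wide. A bin with 100 positives out of 500 samples gives a far narrower interval than one with 1 positive out of 5, even though both have the same empirical rate, which is the kind of uncertainty the abstract refers to.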

Notes

1. A Python implementation of the algorithm and the data used in the experiments are available at https://github.com/Trinli/calibration.

Acknowledgement

This work was supported by the Academy of Finland (Flagship programme: Finnish Center for Artificial Intelligence, FCAI).

Author information

Authors and Affiliations

University of Helsinki, Helsinki, Finland
Otto Nyberg & Arto Klami

Corresponding author

Correspondence to Otto Nyberg.

Editor information

Editors and Affiliations

IIIT Hyderabad, Hyderabad, India
Kamal Karlapalem
Chinese University of Hong Kong, Shatin, Hong Kong
Hong Cheng
Virginia Tech, Arlington, VA, USA
Naren Ramakrishnan
Jawaharlal Nehru University, New Delhi, India
R. K. Agrawal
IIIT Hyderabad, Hyderabad, India
P. Krishna Reddy
University of Minnesota, Minneapolis, MN, USA
Jaideep Srivastava
IIIT Delhi, New Delhi, India
Tanmoy Chakraborty

Cite this paper

Nyberg, O., Klami, A. (2021). Reliably Calibrated Isotonic Regression. In: Karlapalem, K., et al. Advances in Knowledge Discovery and Data Mining. PAKDD 2021. Lecture Notes in Computer Science, vol 12712. Springer, Cham. https://doi.org/10.1007/978-3-030-75762-5_46

DOI: https://doi.org/10.1007/978-3-030-75762-5_46
Published: 09 May 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75761-8
Online ISBN: 978-3-030-75762-5
eBook Packages: Computer Science, Computer Science (R0)
