
Classifier Calibration


Abstract

Classifier calibration is concerned with the scale on which a classifier’s scores are expressed. While a classifier ultimately maps instances to discrete classes, it is often beneficial to decompose this mapping into a scoring classifier, which outputs one or more real-valued numbers, and a decision rule, which converts these numbers into predicted classes. For example, a linear classifier might output a positive or negative score whose magnitude is proportional to the distance between the instance and the decision boundary, in which case the decision rule is a simple threshold on that score. The advantage of calibrating these scores to a known, domain-independent scale is that the decision rule then also takes a domain-independent form and does not have to be learned. The best-known example occurs when the classifier’s scores approximate, in a precise sense, the posterior probability over the classes; the main advantage is that the optimal decision rule is then to predict the class that minimizes the expected cost, averaged over all possible true classes. The main methods for obtaining calibrated scores are logistic calibration, a parametric method that assumes the distances on either side of the decision boundary are normally distributed, and a nonparametric alternative variously known as isotonic regression, the pool adjacent violators (PAV) method, or the ROC convex hull (ROCCH) method.
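To make the two calibration methods and the cost-based decision rule concrete, here is a minimal sketch in Python. It assumes scikit-learn and a synthetic dataset; the entry itself prescribes no particular library, and the cost matrix below is a purely hypothetical illustration.

```python
# Minimal sketch: calibrating a linear classifier's scores and applying
# the cost-sensitive decision rule. Assumes scikit-learn; the dataset
# and cost matrix are hypothetical, not taken from the entry.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A linear SVM is a scoring classifier: decision_function returns signed
# distances to the decision boundary, on an arbitrary, domain-dependent scale.
base = LinearSVC()

# Logistic (Platt) calibration: fit a sigmoid to the raw scores, the
# parametric choice motivated by normally distributed scores per class.
logistic = CalibratedClassifierCV(base, method="sigmoid", cv=5)
logistic.fit(X_train, y_train)

# Isotonic calibration: fit a nondecreasing step function to the scores
# via pool adjacent violators (equivalently, the ROC convex hull).
isotonic = CalibratedClassifierCV(base, method="isotonic", cv=5)
isotonic.fit(X_train, y_train)

# With calibrated posterior estimates, the optimal decision rule predicts
# the class minimizing expected cost. C[i, j] is the (hypothetical) cost
# of predicting class j when the true class is i.
C = np.array([[0.0, 1.0],
              [5.0, 0.0]])
proba = logistic.predict_proba(X_test)  # column k estimates P(class k | x)
expected_cost = proba @ C               # entry (n, j): expected cost of predicting j
y_pred = expected_cost.argmin(axis=1)   # minimize expected cost per instance
```

As a rule of thumb, the parametric sigmoid tends to be safer on small calibration sets, while isotonic regression, being nonparametric, can recover arbitrary monotonic distortions given enough data but is prone to overfitting when data are scarce.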



Author information

Correspondence to Peter A. Flach.

Copyright information

© 2017 Springer Science+Business Media New York

About this entry

Cite this entry

Flach, P.A. (2017). Classifier Calibration. In: Sammut, C., Webb, G.I. (eds) Encyclopedia of Machine Learning and Data Mining. Springer, Boston, MA. https://doi.org/10.1007/978-1-4899-7687-1_900
