Confidence Transformation for Combining Classifiers

  • Original Article
Pattern Analysis and Applications

Abstract

This paper investigates a number of confidence transformation methods for measurement-level combination of classifiers. Each confidence transformation method combines a scaling function with an activation function. The activation functions correspond to different types of confidence: likelihood (exponential), log-likelihood, sigmoid, and the evidence combination of sigmoid measures. The sigmoid and evidence measures serve as approximations to class probabilities. The scaling functions are derived by Gaussian density modeling, logistic regression with variable inputs, etc. We test the confidence transformation methods in handwritten digit recognition by combining various sets of classifiers: neural classifiers only, distance classifiers only, strong classifiers, and mixed strong/weak classifiers. The results show that confidence transformation is effective in improving the combination performance in all the settings. Normalizing the class probabilities to unit sum is shown to be detrimental to the combination performance. Comparing the scaling functions, the Gaussian method and logistic regression perform well in most cases. Regarding the confidence types, the sigmoid and evidence measures perform well in most cases, and the evidence measure generally outperforms the sigmoid measure. We also show that the confidence transformation methods are highly robust to the size of the validation sample used for parameter estimation.
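The two-stage structure described in the abstract (a scaling function followed by an activation function) can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the linear scaling form `a * output + b`, the parameter values, the treatment of the log-likelihood confidence as the scaled value itself, and the use of a plain sum rule to combine classifiers. The evidence combination of sigmoid measures is not shown, since the abstract does not give its formula.

```python
import math

def scale(output, a, b):
    """Hypothetical linear scaling of a raw classifier output (assumed form)."""
    return a * output + b

def sigmoid(f):
    """Logistic sigmoid, mapping a scaled output into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-f))

# Activation functions for three of the confidence types named in the abstract.
# Treating the log-likelihood as the scaled value itself is an assumption.
ACTIVATIONS = {
    "log-likelihood": lambda f: f,
    "likelihood": math.exp,
    "sigmoid": sigmoid,
}

def transform(outputs, a, b, activation):
    """Apply scaling, then the chosen activation, to each class output."""
    g = ACTIVATIONS[activation]
    return [g(scale(o, a, b)) for o in outputs]

def combine_sum(confidences_per_classifier):
    """Sum-rule combination (an assumed rule): add confidences per class."""
    n_classes = len(confidences_per_classifier[0])
    totals = [sum(c[k] for c in confidences_per_classifier)
              for k in range(n_classes)]
    best = max(range(n_classes), key=totals.__getitem__)
    return best, totals

# Made-up outputs for a three-class problem.
neural_outputs = [0.2, 2.0, -1.0]   # larger means more similar
distances = [9.0, 4.0, 1.0]         # smaller means more similar

# A negative slope turns distances into similarity-like scaled values.
conf_neural = transform(neural_outputs, a=1.0, b=0.0, activation="sigmoid")
conf_distance = transform(distances, a=-0.5, b=2.0, activation="sigmoid")

label, totals = combine_sum([conf_neural, conf_distance])
print(label, totals)
```

Note that the sigmoid confidences are left unnormalized across classes, in line with the abstract's observation that normalizing to unit sum harms combination performance. In practice the scaling parameters would be estimated on validation data, for example by the Gaussian or logistic regression methods the abstract mentions, rather than set by hand as here.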

Acknowledgements

The work of Hongwei Hao was done while he was working at the Hitachi Central Research Laboratory. The authors thank Kazuki Nakashima and Ryuji Mine for providing the datasets.

Author information

Corresponding author

Correspondence to Cheng-Lin Liu.

About this article

Cite this article

Liu, CL., Hao, H. & Sako, H. Confidence Transformation for Combining Classifiers. Pattern Anal Applic 7, 2–17 (2004). https://doi.org/10.1007/s10044-003-0199-5

  • DOI: https://doi.org/10.1007/s10044-003-0199-5