Abstract
This paper presents a novel reject rule for support vector classifiers, based on the receiver operating characteristic (ROC) curve. The rule minimises the expected classification cost, which is defined in terms of the classification and error costs of the particular application at hand. The rationale of the proposed approach is that the ROC curve of the support vector machine (SVM) contains all the information needed to find the threshold values that minimise the expected classification cost. To evaluate the effectiveness of the proposed reject rule, a large number of tests have been performed on several data sets and with different kernels. A comparison technique based on the Wilcoxon rank sum test has been defined and employed to assess the results at an adequate significance level. The experiments confirm the effectiveness of the proposed reject rule.
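As an illustration of the idea of choosing reject thresholds to minimise expected cost, the following is a minimal sketch and not the procedure of the paper, which derives the thresholds from the ROC curve. Here the ROC-based derivation is replaced by a plain exhaustive search over pairs of thresholds applied to classifier scores. The function names, the 0/1 label convention, the toy scores, and the cost values are all assumptions introduced for the example.

```python
from itertools import combinations_with_replacement


def expected_cost(scores, labels, t_neg, t_pos, costs):
    """Average cost per sample for a two-threshold reject rule.

    Assumed rule: score > t_pos -> positive, score < t_neg -> negative,
    anything in [t_neg, t_pos] is rejected. Labels are 1 (positive) or 0.
    """
    total = 0.0
    for s, y in zip(scores, labels):
        if s > t_pos:
            total += costs["tp"] if y == 1 else costs["fp"]
        elif s < t_neg:
            total += costs["tn"] if y == 0 else costs["fn"]
        else:
            total += costs["reject"]
    return total / len(scores)


def best_reject_thresholds(scores, labels, costs):
    """Brute-force search for (cost, t_neg, t_pos) with minimal average cost."""
    distinct = sorted(set(scores))
    # Candidates: one value below all scores, midpoints between
    # neighbouring scores, and one value above all scores.
    candidates = [distinct[0] - 1.0]
    candidates += [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    candidates.append(distinct[-1] + 1.0)

    best = None
    # Pairs come out with t_neg <= t_pos because candidates are sorted;
    # t_neg == t_pos corresponds to classifying without any rejection.
    for t_neg, t_pos in combinations_with_replacement(candidates, 2):
        cost = expected_cost(scores, labels, t_neg, t_pos, costs)
        if best is None or cost < best[0]:
            best = (cost, t_neg, t_pos)
    return best


# Toy validation data (hypothetical SVM decision values and true labels).
scores = [-2.0, -1.5, -0.3, -0.1, 0.2, 0.4, 1.2, 1.8]
labels = [0, 0, 1, 0, 0, 1, 1, 1]
# Illustrative costs: errors cost 1, a rejection costs 0.2, correct decisions 0.
costs = {"tp": 0.0, "tn": 0.0, "fp": 1.0, "fn": 1.0, "reject": 0.2}

if __name__ == "__main__":
    print(best_reject_thresholds(scores, labels, costs))
```

On this toy data the search prefers to reject the three samples closest to the decision boundary rather than accept one misclassification, because three rejections at 0.2 each cost less than a single error at 1. The search is quadratic in the number of distinct scores, so it only suits small validation sets.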
Acknowledgements
This work has been partially supported by the MIUR (Italian Ministry of University and Research) under the PRIN 2003 project "A system for computer aided analysis and remote access of mammographic images for early diagnosis of breast cancer".
Cite this article
Tortorella, F. Reducing the classification cost of support vector classifiers through an ROC-based reject rule. Pattern Anal Applic 7, 128–143 (2004). https://doi.org/10.1007/s10044-004-0209-2
DOI: https://doi.org/10.1007/s10044-004-0209-2