Abstract
ROC (Receiver Operating Characteristic) analysis has been used as a tool for analyzing and evaluating two-class classifiers, even when the training data exhibit an unbalanced class distribution and cost sensitivity. However, ROC analysis has not been effectively extended to the evaluation of multi-class classifiers. In this paper, we propose an effective way to handle multi-class learning with ROC analysis. An EMAUC algorithm is implemented to transform a multi-class training set into several two-class training sets, and classification is carried out with these two-class training sets. Empirical results demonstrate that classifiers trained with the proposed algorithm achieve competitive performance in unbalanced-distribution and cost-sensitive domains.
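The abstract does not specify how EMAUC builds its two-class training sets, so the following is only a minimal sketch of the general idea, assuming a plain one-vs-rest decomposition; it is not the authors' algorithm. The function names `one_vs_rest_splits` and `auc` and the toy data are hypothetical, introduced purely for illustration. The AUC is computed with the standard rank-based (Mann-Whitney) formulation, counting ties as one half.

```python
def one_vs_rest_splits(samples, labels):
    """Turn one multi-class training set into one two-class set per class.

    Assumption: a generic one-vs-rest decomposition, NOT the EMAUC procedure.
    Returns {class: [(sample, 1 if label == class else 0), ...]}.
    """
    classes = sorted(set(labels))
    return {
        c: [(x, 1 if y == c else 0) for x, y in zip(samples, labels)]
        for c in classes
    }


def auc(scores, binary_labels):
    """Area under the ROC curve for a two-class problem.

    Equals the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; tied pairs count as 0.5.
    """
    pos = [s for s, y in zip(scores, binary_labels) if y == 1]
    neg = [s for s, y in zip(scores, binary_labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one numeric feature per sample, three classes.
samples = [0.1, 0.4, 0.35, 0.8, 0.9, 0.2]
labels = ["a", "b", "a", "c", "c", "b"]

splits = one_vs_rest_splits(samples, labels)

# Use the raw feature as a stand-in for a classifier score and evaluate
# each two-class problem separately.
per_class_auc = {
    c: auc([x for x, _ in pairs], [y for _, y in pairs])
    for c, pairs in splits.items()
}
print(per_class_auc)
```

Each two-class set can then be handed to any binary learner and evaluated with ordinary ROC analysis, which is why a decomposition of this kind makes two-class tools usable on a multi-class problem.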
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, X., Jiang, C., Luo, Mj. (2006). Training Classifiers for Unbalanced Distribution and Cost-Sensitive Domains with ROC Analysis. In: Hoffmann, A., Kang, Bh., Richards, D., Tsumoto, S. (eds) Advances in Knowledge Acquisition and Management. PKAW 2006. Lecture Notes in Computer Science, vol 4303. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11961239_8
DOI: https://doi.org/10.1007/11961239_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-68955-3
Online ISBN: 978-3-540-68957-7
eBook Packages: Computer Science, Computer Science (R0)