Abstract
In this paper we compare thirteen methods for obtaining multi-class probability estimates, using two medical case studies. The basic classification method used to implement all methods is the least squares support vector machine (LS-SVM) classifier. Results indicate that multi-class kernel logistic regression performs very well, as does a method based on ensembles of nested dichotomies. In addition, a Bayesian LS-SVM method imposing sparseness performed very well for methods that combine binary probabilities into multi-class probabilities.
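To make "combining binary probabilities into multi-class probabilities" concrete, the following is a minimal sketch of one well-known closed-form pairwise coupling rule. It is an illustration only and is not claimed to be the implementation used in the paper. The function names and the toy probabilities are assumptions introduced here. The input `r[i][j]` stands for a binary classifier's estimate of the probability of class i given that the case belongs to class i or class j.

```python
def pairwise_to_multiclass(r):
    """Combine pairwise probabilities into a multi-class probability vector.

    r[i][j] approximates P(class i | class i or class j, x) for i != j and
    must be strictly positive; diagonal entries are ignored.
    Closed-form rule: p_i = 1 / (sum_{j != i} 1 / r[i][j] - (k - 2)),
    followed by normalisation so the estimates sum to one.
    """
    k = len(r)
    raw = []
    for i in range(k):
        inv_sum = sum(1.0 / r[i][j] for j in range(k) if j != i)
        raw.append(1.0 / (inv_sum - (k - 2)))
    total = sum(raw)
    return [p / total for p in raw]


def consistent_pairwise(p):
    """Hypothetical helper: build exact pairwise probabilities from a known
    class distribution p, used here only to check the coupling rule."""
    k = len(p)
    return [[p[i] / (p[i] + p[j]) if i != j else 0.0 for j in range(k)]
            for i in range(k)]


# Toy three-class example (made-up numbers, not data from the paper).
true_p = [0.5, 0.3, 0.2]
estimate = pairwise_to_multiclass(consistent_pairwise(true_p))
```

When the pairwise estimates are mutually consistent, as in this toy case, the rule recovers the underlying class probabilities. With real binary classifiers the pairwise estimates generally disagree, which is why several alternative coupling schemes exist and why comparing them empirically is worthwhile.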
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Van Calster, B. et al. (2007). Comparing Methods for Multi-class Probabilities in Medical Decision Making Using LS-SVMs and Kernel Logistic Regression. In: de Sá, J.M., Alexandre, L.A., Duch, W., Mandic, D. (eds) Artificial Neural Networks – ICANN 2007. ICANN 2007. Lecture Notes in Computer Science, vol 4669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74695-9_15
DOI: https://doi.org/10.1007/978-3-540-74695-9_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74693-5
Online ISBN: 978-3-540-74695-9
eBook Packages: Computer Science, Computer Science (R0)