Using Multiple SVM Models for Unbalanced Credit Scoring Data Sets

Schebesch, Klaus B.; Stecking, Ralf

doi:10.1007/978-3-540-78246-9_61

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization (STUDIES CLASS)

Abstract

Owing to the huge size of credit markets, even small improvements in classification accuracy might considerably reduce the effective misclassification costs borne by banks. Support vector machines (SVMs) are useful classification methods for credit client scoring. However, the need to further boost classification performance and the stability of results in applications has led the machine learning community to develop SVMs with multiple kernels and many other combined approaches. Using a data set from a German bank, we first examine the effects of combining a large number of base SVMs on classification performance and robustness. The base models are trained on different sets of reduced client characteristics and may also use different kernels. Furthermore, using censored outputs of multiple SVM models leads to more reliable predictions in most cases. However, there remains a subset of credit clients that seems to be unpredictable. We show that in unbalanced data sets, which are the most common case in credit scoring, some minor adjustments may overcome this weakness. We then compare our results with those obtained earlier with more traditional single-SVM credit scoring models.
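
To make the idea of combining censored outputs concrete, the following minimal Python sketch may help. It is not the authors' implementation: it assumes that "censoring" means clipping each base model's raw decision value to the interval [-1, 1], which the abstract does not define. The function names (`censor`, `combine_censored`, `classify`), the `threshold` and `margin` parameters, and the decision values in `raw` are all hypothetical and chosen only for illustration.

```python
from typing import List, Sequence


def censor(value: float, bound: float = 1.0) -> float:
    """Clip a raw decision value to [-bound, bound].

    Assumption: this is one plausible reading of "censored output".
    """
    return max(-bound, min(bound, value))


def combine_censored(outputs: Sequence[Sequence[float]],
                     bound: float = 1.0) -> List[float]:
    """Average the censored outputs of several base models per client.

    outputs[m][i] is the raw decision value of base model m for client i.
    """
    n_models = len(outputs)
    n_clients = len(outputs[0])
    return [
        sum(censor(outputs[m][i], bound) for m in range(n_models)) / n_models
        for i in range(n_clients)
    ]


def classify(scores: Sequence[float],
             threshold: float = 0.0,
             margin: float = 0.0) -> List[int]:
    """Return +1 or -1 per client, or 0 when the combined score lies
    within `margin` of `threshold` (treated here as "unpredictable")."""
    labels = []
    for s in scores:
        if abs(s - threshold) <= margin:
            labels.append(0)
        else:
            labels.append(1 if s > threshold else -1)
    return labels


# Hypothetical decision values of three base SVMs for four clients.
raw = [
    [2.5, -0.2, 0.4, -3.0],
    [0.5, 0.3, 0.6, -0.8],
    [1.5, -0.1, 0.1, -1.2],
]
scores = combine_censored(raw)
labels = classify(scores, threshold=0.0, margin=0.1)
print(labels)
```

Clipping keeps a single very confident base model from dominating the average. Shifting `threshold` away from zero is one simple way to account for unequal class sizes; whether this corresponds to the "minor adjustments" mentioned in the abstract is an assumption.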

Author information

Authors and Affiliations

Faculty of Economics, University “Vasile Goldiş”, Arad, Romania
Klaus B. Schebesch
Faculty of Economics, University of Oldenburg, D-26111, Oldenburg, Germany
Ralf Stecking

Editor information

Editors and Affiliations

Institute of Computer Science and Institute of Business Economics and Information Systems, University of Hildesheim, Marienburgerplatz 22, 31141, Hildesheim, Germany
Christine Preisach
Lehrstuhl für Mustererkennung und Bildverarbeitung, Universität Freiburg, Gebäude 052, 79110, Freiburg i. Br, Germany
Hans Burkhardt
Institute of Computer Science and Institute of Business Economics and Information Systems, Marienburgerplatz 22, 31141, Hildesheim, Germany
Lars Schmidt-Thieme
Fakultät für Wirtschaftswissenschaften, Lehrstuhl für Betriebswirtschaftslehre, insbes. Marketing, Universitätsstraße 25, 33615, Bielefeld, Germany
Reinhold Decker

About this paper

Cite this paper

Schebesch, K.B., Stecking, R. (2008). Using Multiple SVM Models for Unbalanced Credit Scoring Data Sets. In: Preisach, C., Burkhardt, H., Schmidt-Thieme, L., Decker, R. (eds) Data Analysis, Machine Learning and Applications. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-78246-9_61

DOI: https://doi.org/10.1007/978-3-540-78246-9_61
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-78239-1
Online ISBN: 978-3-540-78246-9
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)
