
Advances in credit scoring: combining performance and interpretation in kernel discriminant analysis


Abstract

The recent financial turmoil has raised a discussion in the banking sector about how to achieve long-term success and how to follow a thorough and effective credit scoring strategy. In recent years, significant theoretical advances in machine learning have pushed the application of kernel-based classifiers, producing very effective results. Unfortunately, such tools are unable to provide an explanation, or a comprehensible justification, for the solutions they supply. In this paper, we propose a new strategy to model credit scoring data, which indirectly exploits the classification power of kernel machines in an operational setting. A reconstruction of the kernel classifier is performed via linear regression, if all predictors are numerical, or via a general linear model, if some or all predictors are categorical. The loss of performance due to such an approximation is balanced by better interpretability for the end user, who is able to order, understand and rank the influence of each category of the variable set on the prediction. An Italian bank case study is illustrated and discussed; the empirical results reveal the promising performance of the proposed strategy.
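To make the strategy concrete, the sketch below illustrates the general idea in Python: a kernel machine is trained first, and its decision scores are then approximated by a linear regression on the original predictors, whose coefficients can be ranked and interpreted. This is only a hypothetical, minimal illustration: an RBF support vector classifier stands in for the Cauchy kernel discriminant used in the paper, and the data are synthetic.

```python
# Minimal sketch of the reconstruction idea: approximate a kernel
# classifier's scores with an interpretable linear model.
# The RBF SVM is only a stand-in for the kernel discriminant of the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LinearRegression
from sklearn.svm import SVC

# synthetic stand-in for the credit scoring data (numerical predictors only)
X, y = make_classification(n_samples=1000, n_features=8, random_state=0)

# 1) fit the kernel classifier (high performance, but a black box)
kernel_clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
scores = kernel_clf.decision_function(X)          # nonlinear discriminant scores

# 2) reconstruct the scores with a linear regression on the original predictors
surrogate = LinearRegression().fit(X, scores)

# 3) interpret: rank the predictors by the magnitude of their coefficients
ranking = np.argsort(-np.abs(surrogate.coef_))
print("predictors ranked by influence on the score:", ranking)
print("quality of the approximation (R^2):", round(surrogate.score(X, scores), 3))
```

With categorical predictors, the same reconstruction step would be carried out on dummy-coded levels (a general linear model), so that each category receives its own interpretable coefficient.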


Notes

  1. A better estimation of the inertia has been proposed by Greenacre (1984), who suggested evaluating the percentage of inertia relative to the average inertia of the off-diagonal blocks of the Burt matrix; a small numerical sketch of this computation is given right after these notes. The average inertia can be computed as:

    $$\begin{aligned} \mathcal {\bar{I}}=\frac{m}{m-1}\left( \sum _{l} \lambda _{l}^{2}-\frac{j-m}{m^{2}}\right) \end{aligned}$$
    (3)

    where \(m\) is the number of nominal variables, \(j\) is the sum of the levels of the nominal variables and \(\lambda _{l}\) are the eigenvalues of the multiple correspondence analysis.

  2. The choice of employing the same window width for all the discriminant functions allows a fair comparison among different models. Alternative values of \(\delta \) were applied (\(\delta =5,10,20\)), but they produced the same ranking in terms of correct prediction.

  3. In order to preserve ease of interpretation, we chose not to include in the multiple regression any interaction among the original variables.

  4. The test data are a random set of 4997 companies, sampled from all the instances not included in the training set.
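The following is a small numerical sketch of Eq. (3), assuming the \(\lambda _{l}\) are the eigenvalues of the multiple correspondence analysis of the indicator matrix; the eigenvalues, \(m\) and \(j\) below are made up purely for illustration.

```python
def greenacre_average_inertia(eigenvalues, m, j):
    """Average off-diagonal inertia of the Burt matrix (Greenacre 1984), Eq. (3).

    eigenvalues : principal inertias of the MCA of the indicator matrix
    m           : number of nominal variables
    j           : total number of levels over all nominal variables
    """
    return m / (m - 1) * (sum(lam ** 2 for lam in eigenvalues) - (j - m) / m ** 2)

# hypothetical example: m = 4 variables with j = 14 levels in total,
# so the indicator-matrix MCA has j - m = 10 nontrivial eigenvalues
lams = [0.55, 0.45, 0.35, 0.30, 0.25, 0.20, 0.15, 0.12, 0.08, 0.05]
print(round(greenacre_average_inertia(lams, m=4, j=14), 3))
```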

References

  • Abdou H, Pointon J, El Masry A (2008) Neural nets versus conventional techniques in credit scoring in Egyptian banking. Expert Syst Appl 35:1275–1292

  • Akkoç S (2012) An empirical comparison of conventional techniques, neural networks and the three stage hybrid adaptive neuro fuzzy inference system (ANFIS) model for credit scoring analysis: the case of Turkish credit card data. Eur J Oper Res 222:168–178

  • Altman E, Sabato G (2007) Modeling credit risk for SMEs: evidence from the U.S. market. ABACUS 43(3):332–357

  • Altman E, Sabato G, Wilson N (2010) The value of non-financial information in small and medium-sized enterprise risk management. J Credit Risk 6(2):95–127

  • Angelini E, Di Tollo G, Roli A (2008) A neural network approach for credit risk evaluation. Q Rev Econ Finance 48:733–755

  • Back B, Laitinen T, Sere K, van Wezel M (1996) Choosing bankruptcy predictors using discriminant analysis, logit analysis, and genetic algorithms. In: Proceedings of the 1st international meeting on artificial intelligence in accounting, finance and tax, pp 337–356

  • Baesens B, Van Gestel T, Viaene S, Stepanova M, Suykens J, Vanthienen J (2003) Benchmarking state-of-the-art classification algorithms for credit scoring. J Oper Res Soc 54(6):627–635

  • Barakat N, Bradley AP (2010) Evaluating consumer loans using neural networks. Neurocomputing 74:178–190

  • Basel III (2011) A global regulatory framework for more resilient banks and banking systems. Basel Committee on Banking Supervision, Basel

  • Baudat G, Anouar F (2000) Generalized discriminant analysis using a kernel approach. Neural Comput 12:2385–2404

  • Benzécri JP (1973) L’analyse des données, vol 2. Dunod, Paris

  • Benzécri JP (1979) Sur le calcul des taux d’inertie dans l’analyse d’un questionnaire, addendum et erratum à (bin. mult.). Cah Anal Données 4(3):377–378

  • Bozdogan H, Camillo F, Liberati C (2006) On the choice of the kernel function in kernel discriminant analysis using information complexity. In: Zani S, Cerioli A, Riani M, Vichi M (eds) Data analysis, classification and the forward search, studies in classification, data analysis, and knowledge organization. Springer, Berlin, pp 11–21

  • Cawley GC, Talbot NLC (2003) Efficient leave-one-out cross-validation of kernel Fisher discriminant classifiers. Pattern Recognit 36(11):2585–2592

  • Chapelle O, Vapnik V, Bousquet O, Mukherjee S (2002) Choosing multiple parameters for support vector machines. Mach Learn 46:131–159

  • Cunningham P, Doyle D, Loughrey J (2003) An evaluation of the usefulness of case-based explanation. In: Langley P (ed) Proceedings of the fifth international conference on case-based reasoning (ICCBR 2003). Morgan Kaufmann, New York, pp 122–130

  • Derelioğlu G, Gürgen F (2011) Knowledge discovery using neural approach for SMEs’ credit risk analysis problem in Turkey. Expert Syst Appl 38:9313–9318

  • Duda RO, Hart P, Stork D (2000) Pattern classification. Wiley, New York

  • Friedman JH (1989) Regularized discriminant analysis. J Am Stat Assoc 84(405):165–175

  • Gönen GB, Gönen M, Gürgen F (2012) Credit rating analysis with support vector machines and neural networks: a market comparative study. Expert Syst Appl 39:11709–11717

  • Greenacre MJ (1984) Theory and applications of correspondence analysis. Academic Press, London

  • Grunet J, Norden L, Weber M (2008) The role of non-financial factors in internal credit ratings. J Bank Finance 2:509–531

  • Hill P, Wilson N (2007) Predicting the insolvency of unlisted companies. In: Working paper, CMRC, Leeds University

  • Hosmer D, Lemeshow S (1989) Applied logistic regression. Wiley, New York

  • Huang CL, Wang CJ (2006) A GA-based feature selection and parameters optimization for support vector machines. Expert Syst Appl 31:231–240

  • Huang Z, Chen H, Hsu CJ, Chen WH, Wu S (2004) Credit rating analysis with support vector machines and neural networks: a market comparative study. Decis Support Syst 37(4):543–558

  • Huang YM, Hung C, Jiau HC (2006) Evaluation of neural networks and data mining methods on a credit assessment task for class imbalance problem. Nonlinear Anal Real World Appl 7:720–747

  • Huang CL, Chen MC, Wang CJ (2007) Credit scoring with a data mining approach based on support vector machines. Expert Syst Appl 33:847–856

  • Karush W (1939) Minima of functions of several variables with inequalities as side constraints. M.sc. thesis, University of Chicago

  • Khandani AE, Kim AJ, Lo AW (2010) Consumer credit-risk models via machine-learning algorithms. J Bank Finance 34:2767–2787

  • Khashman A (2010) Neural networks for credit risk evaluation: investigation of different neural models and learning schemes. Expert Syst Appl 37:6233–6239

  • Kim HS, Sohn SY (2010) Support vector machines for default prediction of SMEs based on technology credit. Eur J Oper Res 201:838–846

  • Kuhn HW, Tucker AW (1951) Nonlinear programming. Proceedings of the second Berkeley symposium on mathematical statistics and probability. University of California Press, Berkeley, pp 481–492

  • Lebart L, Morineau A, Warwick K (1984) Multivariate descriptive statistical analysis. Wiley, New York

  • Liberati C, Howe A, Bozdogan H (2009) Data adaptive simultaneous parameter and kernel selection in kernel discriminant analysis (KDA) using information complexity. J Pattern Recognit Res 4(1):119–132

  • Malhotra R, Malhotra DK (2003) Evaluating consumer loans using neural networks. Omega 31:83–96

  • Mavri M, Angelis V, Ioannou G (2008) A two-stage dynamic credit scoring model based on customers’ profiles and time horizon. J Financ Serv Market 13(1):17–27

  • Mays E (2004) Credit scoring for risk managers: the handbook for lenders. Thomson Learning

  • Mercer J (1909) Functions of positive and negative type and their connection with the theory of integral equations. Philos Trans R Soc Lond

  • Mika S, Rätsch G, Weston J, Schölkopf B, Müller KR (1999) Fisher discriminant analysis with kernels. In: Neural networks for signal processing, vol IX. Proceedings of the 1999 IEEE signal processing society workshop, pp 41–48

  • Müller KR, Mika S, Rätsch G, Tsuda K, Schölkopf B (2001) An introduction to kernel-based learning algorithms. IEEE Trans Neural Netw 2:181–201

  • Ong C, Huang J, Tzeng GH (2005) Building credit scoring models using genetic programming. Expert Syst Appl 29(1):41–47

  • Peel M, Peel D (1989) A multi-logit approach to predicting corporate failure—some evidence for the UK corporate sector. Omega Int J Manag Sci 16(4):309–318

  • Ping Y, Yongheng L (2011) Neighborhood rough set and SVM based hybrid credit scoring classifier. Expert Syst Appl 38:11300–11304

  • Press S (1975) Estimation of a normal covariance matrix. Rand Corporation, Santa Monica

  • Saporta G (1977) Une méthode et un programme d’analyse discriminante sur variables qualitatives. In: Diday E (ed) Analyse des Données et Informatique, INRIA, pp 201–210

  • Schölkopf B, Burges C, Smola AJ (1999a) Advances in kernel methods. MIT Press, Cambridge

  • Schölkopf B, Mika S, Burges C, Knirsch P, Müller KR, Rätsch G, Smola AJ (1999b) Input space versus feature space in kernel-based methods. IEEE Trans Neural Netw 5:1000–1017

  • Shi Y, Wise M, Luo M, Lin Y (2001) Data mining in credit card portfolio management: a multiple criteria decision making approach. In: Koksalan M, Zionts S (eds) Multiple criteria decision making in the new millennium. Springer, Heidelberg, pp 427–436

  • Shi Y, Peng Y, Xu W, Tang X (2002) Data mining via multiple criteria linear programming: applications in credit card portfolio management. Int J Inf Technol Decis Mak 1:131–151

  • Smalz R, Conrad M (1994) Combining evolution with credit apportionment: a new learning algorithm for neural nets. Neural Netw 7(2):341–351

  • Soares C, Brazdil PB (2006) Selecting parameters of SVM using meta-learning and kernel matrix-based meta-features. In: Proceedings of the 2006 ACM symposium on applied computing, ACM, New York, SAC ’06, pp 564–568. doi:10.1145/1141277.1141408

  • Suykens J, Vandewalle J (1999) Least squares support vector machine classifiers. Neural Process Lett 9(3):293–300

  • Suykens JAK, Van Gestel T, De Brabanter J, De Moor B, Vandewalle J (2002) Least squares support vector machines. World Scientific, Singapore

  • Thomaz C, Boardman J, Hill D, Hajnal J, Edwards D, Rutherford M, Gillies D, Rueckert D (2004) Using a maximum uncertainty LDA-based approach to classify and analyse MR brain images. Medical image computing and computer-assisted intervention MICCAI 2004. Springer, Berlin, pp 291–300

  • Van Gestel T, Baesens B, Suykens JAK, Van den Poel D, Baestaens DE, Willekens M (2006) Bayesian kernel based classification for financial distress detection. Eur J Oper Res 172:979–1003

  • Vapnik V (1995) The nature of statistical learning theory. Springer, New York

  • Vapnik V (1998) Statistical learning theory. Wiley, New York

  • Varetto F (1998) Genetic algorithms applications in the analysis of insolvency risk. J Bank Finance 22:1421–1439

  • Wiginton JC (1980) A note on the comparison of logit and discriminant models of consumer credit behavior. J Financ Quant Anal 15:757–770

  • Yao P, Wu C, Yang M (2009) Credit risk assessment model of commercial banks based on fuzzy neural network. In: Proceedings of the sixth international symposium on neural networks

  • Yap BW, Ong SH, Husain N (2011) Using data mining to improve assessment of credit worthiness via credit scoring models. Expert Syst Appl 38:13274–13283

  • Yoon JS, Kwon YS (2010) A practical approach to bankruptcy prediction for small businesses: substituting the unavailable financial data for credit card sales information. Expert Syst Appl 37:3624–3629

  • Zhang K, Lan L, Wang Z, Moerchen F (2012) Scaling up kernel SVM on limited resources: a low-rank linearization approach. J Mach Learn Res Proc Track 22:1425–1434

  • Zhou X, Shi W, Tian Y (2011) Genetic algorithms applications in the analysis of insolvency risk. Expert Syst Appl 38:4272–4279

Author information

Correspondence to Caterina Liberati.

Appendix: Results of characterization of classification

Table 7 Categories characterizing the group of the bad instances classified as good
Table 8 Categories characterizing the group of the bad instances classified as bad
Table 9 Categories characterizing the group of the good instances classified as bad
Table 10 Categories characterizing the group of the good instances classified as good

Characterization of the test partition has been carried out by ranking all the characterizing variables of a group by means of a probabilistic criterion: the value-test (Lebart et al. 1984). More specifically, the absolute values of such a test are simple measures of the similarity between groups and variables. Therefore, a category of a variable can be considered characteristic of a group if its presence in the group is significantly higher than what would be expected given its presence in the sample. The value-test follows a hypergeometric distribution but can easily be approximated by a standardized normal using the Laplace–Gauss approximation. In formula:

$$\begin{aligned} t_{q}(N)=\frac{N-E(N)}{s_{q}(N)} \end{aligned}$$
(12)

where \(N \sim Hyp(n,n_{\nu }, n_{q})\), \(E(N)=n_{q}\frac{n_{\nu }}{n}\) and \(s^{2}_{q}(N)=n_{q} \frac{n-n_{q}}{n-1} \frac{n_{\nu }}{n} \left(1-\frac{n_{\nu }}{n}\right)\); \(n_{q}\) is the number of instances, sampled without replacement, belonging to the qth group and \(n_{\nu }\) is the number of instances with the \(\nu \)th category.

Tables 7, 8, 9 and 10 aid the interpretation of the test classification obtained by means of the reconstructed Cauchy kernel discriminant. The first column of each table collects the characteristic categories; the second shows the percentage of instances with the \(\nu \)th category in group q (\(n_{\nu q}/n_{q}\), where \(n_{\nu q}\) is the number of instances with the \(\nu \)th category among those belonging to class q); the third, the percentage of instances with the \(\nu \)th category in the test set (\(n_{\nu }/n\)); the fourth column shows the percentage of instances with the \(\nu \)th category that belong to the qth group (\(n_{\nu q}/n_{\nu }\)); the fifth and the sixth columns collect the value-test and the probability values, respectively. Such measures synthesize the homogeneity and the selectivity of the partition.
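For completeness, a minimal sketch of Eq. (12) in Python is reported below; the counts passed in the example call are hypothetical and only illustrate the interface.

```python
import math

def value_test(n, n_nu, n_q, n_nu_q):
    """Value-test of Lebart et al. (1984) with the Laplace-Gauss approximation, Eq. (12).

    n       : number of instances in the test set
    n_nu    : instances carrying the nu-th category in the whole test set
    n_q     : instances assigned to the q-th group
    n_nu_q  : instances of the q-th group carrying the nu-th category (the observed N)
    """
    expected = n_q * n_nu / n                                            # E(N)
    variance = n_q * (n - n_q) / (n - 1) * (n_nu / n) * (1 - n_nu / n)   # s_q^2(N)
    return (n_nu_q - expected) / math.sqrt(variance)

# hypothetical counts: |t| > 1.96 flags the category as characteristic at the 5% level
print(round(value_test(n=4997, n_nu=820, n_q=1200, n_nu_q=260), 2))
```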


Cite this article

Liberati, C., Camillo, F. & Saporta, G. Advances in credit scoring: combining performance and interpretation in kernel discriminant analysis. Adv Data Anal Classif 11, 121–138 (2017). https://doi.org/10.1007/s11634-015-0213-y
