Sparse multi-criteria optimization classifier for credit risk evaluation

  • Methodologies and Application
Soft Computing

Abstract

Over the past few decades, many classification methods have been proposed for credit risk evaluation. With ever-increasing amounts of data, the multi-criteria optimization classifier (MCOC) and other traditional classification methods often give poor predictive performance because of correlation among the features in the data. Dimensionality reduction techniques are therefore commonly applied first to find the important features, and the classifier is then built on the reduced data set. However, because feature selection and classification are carried out in different feature spaces, the goal of improving both predictive accuracy and interpretability is difficult to achieve in practice. It is therefore important to develop new classifiers that perform classification and feature selection simultaneously, so as to improve predictive accuracy and obtain interpretable results. In this paper, we propose a novel sparse multi-criteria optimization classifier (SMCOC) based on one-norm regularization with linear and nonlinear programming, respectively, and construct the corresponding algorithm. Experimental results on credit risk evaluation, together with comparisons against linear and quadratic MCOCs, logistic regression and support vector machines, show that the proposed SMCOC can enhance the separation of different credit applicants, the efficiency of credit scoring, the interpretability of the risk evaluation model and the generalization power of risk prediction for new credit applicants.
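
To illustrate how one-norm regularization lets a single linear program classify and select features at once, the following is a generic textbook-style formulation, not the SMCOC model itself, whose exact objective is not given in this preview. All symbols are assumptions introduced here: training points x_i with labels y_i in {-1, +1}, weight vector w, offset b, slack variables \xi_i and a trade-off constant C > 0.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi}\quad & \lVert w \rVert_1 + C \sum_{i=1}^{n} \xi_i \\
\text{s.t.}\quad & y_i\,(w^{\top} x_i + b) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1,\dots,n .
\end{aligned}
% Linear-programming form: write w = u - v with u, v \ge 0 and
% replace \lVert w \rVert_1 by \mathbf{1}^{\top}(u + v).
```

Under these assumptions, the one-norm penalty tends to drive many components of w exactly to zero, so the corresponding features drop out of the scoring function and the surviving nonzero weights indicate which applicant attributes the classifier actually uses.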

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments and suggestions. This research has been partially supported by the Science Foundation of Ludong University (LY2010013) and the Natural Science Foundation of Shandong (ZR2012FL13, ZR2016FM15).

Author information

Corresponding author

Correspondence to Zhiwang Zhang.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

The article does not contain any studies with human participants or animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Communicated by V. Loia.

About this article

Cite this article

Zhang, Z., He, J., Gao, G. et al. Sparse multi-criteria optimization classifier for credit risk evaluation. Soft Comput 23, 3053–3066 (2019). https://doi.org/10.1007/s00500-017-2953-4

  • DOI: https://doi.org/10.1007/s00500-017-2953-4
