Abstract
Because ensemble methods perform significantly better than individual models, they have been widely applied to credit scoring. However, most of them use a static combiner to combine the base classifiers, which ignores the base classifiers’ characteristics and their dynamic classification ability. Although some dynamic ensemble methods have been proposed, they need to produce a large number of base classifiers or employ a fixed combiner, which limits their generality. In this paper, we propose a new dynamic weighted ensemble method for credit scoring. A Markov chain is employed to model the change in each classifier’s classification ability and to build a dynamic weighted trainable combiner, which dynamically assigns weights to the base classifiers for each sample in the testing set. An experimental study on eight real-world credit data sets demonstrates the ability and efficiency of the dynamic weighted ensemble method to improve prediction performance against the benchmark models, including some well-known individual classifiers and dynamic ensemble methods. Moreover, the proposed method can effectively decrease the misclassification cost, which can reduce risks for financial institutions.
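The abstract does not spell out how the Markov chain produces the weights, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors’ algorithm. It assumes each base classifier has a recorded history of correct (1) and wrong (0) predictions, fits a two-state Markov chain to that history, and weights the classifier by the estimated probability that its next prediction is correct. The function names, the add-one smoothing, the 0.5 decision threshold, and the sample histories are all hypothetical.

```python
from typing import List, Sequence


def transition_matrix(states: Sequence[int]) -> List[List[float]]:
    """Estimate a 2-state Markov transition matrix from a 0/1 sequence.

    State 1 means the classifier was correct on that sample, 0 means wrong.
    Add-one smoothing (an assumption) keeps every row well defined.
    """
    counts = [[1.0, 1.0], [1.0, 1.0]]
    for prev, nxt in zip(states, states[1:]):
        counts[prev][nxt] += 1.0
    return [[c / sum(row) for c in row] for row in counts]


def dynamic_weights(histories: Sequence[Sequence[int]]) -> List[float]:
    """Weight each classifier by P(correct next | its last observed state)."""
    scores = []
    for states in histories:
        matrix = transition_matrix(states)
        scores.append(matrix[states[-1]][1])
    total = sum(scores)
    return [s / total for s in scores]


def weighted_vote(probabilities: Sequence[float], weights: Sequence[float]) -> int:
    """Combine per-classifier default probabilities; 1 = predicted default."""
    combined = sum(p * w for p, w in zip(probabilities, weights))
    return 1 if combined >= 0.5 else 0


# Hypothetical correctness histories for three base classifiers.
histories = [
    [1, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 1],
]
weights = dynamic_weights(histories)

# Hypothetical default probabilities the three classifiers give one applicant.
label = weighted_vote([0.8, 0.3, 0.6], weights)
```

In this sketch the weights are recomputed whenever a history grows, so each new sample can be combined with different weights, which is the sense in which the combiner is dynamic rather than static.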
Acknowledgments
We thank the editor and the referees for their constructive remarks that helped to improve the clarity and the completeness of this paper. The work was supported by the National Natural Science Foundation of China [grant numbers 71671019, 71701116]; and MOE (Ministry of Education in China) Project of Humanities and Social Sciences [grant number 15YJC630016].
About this article
Cite this article
Feng, X., Xiao, Z., Zhong, B. et al. Dynamic weighted ensemble classification for credit scoring using Markov Chain. Appl Intell 49, 555–568 (2019). https://doi.org/10.1007/s10489-018-1253-8
DOI: https://doi.org/10.1007/s10489-018-1253-8