Abstract
With the rapid growth of financial institutions, credit risk prediction (CRP) plays an essential role in granting loans to customers and helps institutions minimize their losses, because credit approval sometimes results in massive financial loss. Extra attention is therefore needed to identify risky customers. Researchers have designed complex CRP models using artificial intelligence (AI) and statistical techniques to help financial institutions make correct business decisions. Although various statistical and AI methods are available, the recent literature shows that ensemble-based CRP models provide better prediction results than single-classifier systems. Even a small increase in the performance of a CRP model can result in a significant improvement in the profit of financial institutions and banks. This work proposes a weight-adjusted boosting ensemble method (WABEM) that uses a rough set (RS)-based feature selection (FS) technique with balancing and regression-based preprocessing, called RS_RFS-WABEM. Regression is used to fill missing values in the records to improve the performance of CRP. Three credit datasets (Australian, German and Japanese) are chosen to validate the feasibility and effectiveness of the proposed ensemble method. The trade-off between the uncertainty and imprecise probability of the proposed classifier model is evaluated using performance measures such as accuracy and area under the curve. Experimental results show that the proposed ensemble method performs better than other base and ensemble classifier methods.
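The abstract does not say which rough-set selection procedure is used, so the following is only a minimal sketch of the general idea, not the authors' method. It computes the rough-set dependency degree, which is the share of records whose decision is fully determined by the chosen attributes. It then adds attributes greedily, in the spirit of QuickReduct-style searches. The attribute names (`income`, `employed`, `owns_home`, `risk`) and the toy records are hypothetical and not taken from the credit datasets.

```python
from collections import defaultdict


def dependency(rows, features, decision):
    """Rough-set dependency degree: fraction of rows in the positive region."""
    if not rows:
        return 0.0
    # Group rows into equivalence classes by their values on `features`.
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[f] for f in features)].append(row[decision])
    # A class lies in the positive region when all its rows share one decision.
    consistent = sum(len(d) for d in classes.values() if len(set(d)) == 1)
    return consistent / len(rows)


def greedy_reduct(rows, candidates, decision):
    """Greedily add the feature that raises the dependency degree the most."""
    target = dependency(rows, candidates, decision)
    selected = []
    while dependency(rows, selected, decision) < target:
        best = max(
            (f for f in candidates if f not in selected),
            key=lambda f: dependency(rows, selected + [f], decision),
        )
        selected.append(best)
    return selected


# Hypothetical toy records (assumption: discrete-valued attributes).
rows = [
    {"income": "high", "employed": "yes", "owns_home": "yes", "risk": "good"},
    {"income": "high", "employed": "no", "owns_home": "yes", "risk": "good"},
    {"income": "low", "employed": "yes", "owns_home": "no", "risk": "good"},
    {"income": "low", "employed": "no", "owns_home": "no", "risk": "bad"},
    {"income": "low", "employed": "no", "owns_home": "yes", "risk": "bad"},
]

reduct = greedy_reduct(rows, ["income", "employed", "owns_home"], "risk")
print(reduct)
```

On this toy table, the selected subset determines the decision as well as the full attribute set does. Continuous attributes in real credit data would need to be discretized before such a search could be applied.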
Ethics declarations
Conflict of interest
We declare that we have no conflict of interest.
Additional information
Communicated by V. Loia.
About this article
Cite this article
Sivasankar, E., Selvi, C. & Mahalakshmi, S. Rough set-based feature selection for credit risk prediction using weight-adjusted boosting ensemble method. Soft Comput 24, 3975–3988 (2020). https://doi.org/10.1007/s00500-019-04167-0
DOI: https://doi.org/10.1007/s00500-019-04167-0