High Accuracy-priority Rule Extraction for Reconciling Accuracy and Interpretability in Credit Scoring

Hayashi, Yoichi; Oishi, Tatsuhiro

doi:10.1007/s00354-018-0043-5

Research Paper
Published: 13 August 2018

Volume 36, pages 393–418, (2018)

New Generation Computing

Abstract

Accuracy and interpretability are difficult to balance, a problem referred to as the accuracy-interpretability dilemma: if credit models gain interpretability, they lose accuracy, and vice versa. Researchers continue to develop increasingly complicated predictive models, whereas the finance industry needs interpretable models that can be used in practice. In particular, advanced sequential ensembles are seldom considered in credit scoring. It is therefore worthwhile to explore new rule extraction methods capable of building sequential ensemble classifiers that are effective for credit scoring. To enhance the accuracy and interpretability of extracted rules, we extend continuous recursive-rule extraction (continuous Re-RX) to a high accuracy-priority rule extraction method referred to as continuous Re-RX with J48graft. The method uses a recursive approach called subdivision, which combines a backpropagation neural network, pruning, and a J48graft decision tree to handle mixed datasets (those containing both discrete and continuous attributes). Compared with previous rule extraction methods on the Australian and German credit datasets, continuous Re-RX with J48graft achieved the highest accuracies: 88.4% and 79.0%, respectively, using tenfold cross validation (CV), with comparisons based on the Friedman and Bonferroni–Dunn tests, and 87.82% and 78.4%, respectively, using 10 runs of tenfold CV, with the best Friedman score. We also demonstrate, based on its performance, how continuous Re-RX with J48graft overcomes the accuracy-interpretability dilemma. We believe that the method can help resolve this dilemma where transparency is required, both for Big Data in financial settings and for industrial applications.
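
The "10 runs of tenfold CV" protocol mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names (`kfold_indices`, `repeated_cv_accuracy`, `fit_majority`, `predict_majority`) are hypothetical, the folds are plain shuffled splits (whether the paper stratifies folds is not stated here), and a majority-class baseline stands in for continuous Re-RX with J48graft purely so the sketch runs.

```python
import random
from collections import Counter
from statistics import mean


def kfold_indices(n, k, rng):
    """Shuffle the indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_majority(train_X, train_y):
    """Placeholder learner (assumption): remembers the most common label."""
    return Counter(train_y).most_common(1)[0][0]


def predict_majority(model, x):
    """Placeholder predictor: always returns the remembered label."""
    return model


def repeated_cv_accuracy(X, y, fit, predict, k=10, runs=10, seed=0):
    """Return (mean accuracy over runs, per-run accuracies).

    Each run reshuffles the data, performs one k-fold CV pass, and
    records the fraction of samples classified correctly when held out.
    """
    rng = random.Random(seed)
    run_scores = []
    for _ in range(runs):
        correct = 0
        for test_idx in kfold_indices(len(y), k, rng):
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
            correct += sum(1 for i in test_idx if predict(model, X[i]) == y[i])
        run_scores.append(correct / len(y))
    return mean(run_scores), run_scores
```

Any classifier exposing the same `fit`/`predict` pair could be substituted for the baseline; averaging over several reshuffled runs reduces the dependence of the reported accuracy on a single random partition.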

Funding

This work was supported in part by the Japan Society for the Promotion of Science through a Grant-in-Aid for Scientific Research (C) (18K11481).

Author information

Authors and Affiliations

Department of Computer Science, Meiji University, Kawasaki, 214-8571, Japan
Yoichi Hayashi & Tatsuhiro Oishi

Authors

Yoichi Hayashi
Tatsuhiro Oishi

Corresponding author

Correspondence to Yoichi Hayashi.

Ethics declarations

Ethical approval

All authors and the responsible authorities where the work was conducted have approved the final manuscript and agree with submission to New Generation Computing.

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Hayashi, Y., Oishi, T. High Accuracy-priority Rule Extraction for Reconciling Accuracy and Interpretability in Credit Scoring. New Gener. Comput. 36, 393–418 (2018). https://doi.org/10.1007/s00354-018-0043-5

Received: 16 March 2018
Accepted: 02 August 2018
Published: 13 August 2018
Issue Date: October 2018
DOI: https://doi.org/10.1007/s00354-018-0043-5
