High Accuracy-priority Rule Extraction for Reconciling Accuracy and Interpretability in Credit Scoring

  • Research Paper
New Generation Computing

Abstract

Accuracy and interpretability are difficult to balance, a tension known as the accuracy-interpretability dilemma: as credit models gain interpretability, they tend to lose accuracy, and vice versa. Researchers continue to develop increasingly complex predictive models, whereas the finance industry needs interpretable models that can be used in practice. In particular, advanced sequential ensembles are seldom considered in credit scoring, so it is worthwhile to explore new rule extraction methods capable of building sequential ensemble classifiers that are effective for credit scoring. To enhance the accuracy and interpretability of extracted rules, we extend continuous recursive-rule extraction (continuous Re-RX) to a high accuracy-priority rule extraction method referred to as continuous Re-RX with J48graft. Continuous Re-RX with J48graft uses a recursive approach called subdivision, which combines a backpropagation neural network, pruning, and a J48graft decision tree to handle mixed datasets (those containing both discrete and continuous attributes). Compared with previous rule extraction methods on the Australian and German datasets, continuous Re-RX with J48graft achieved the highest accuracies: 88.4% and 79.0%, respectively, using tenfold cross-validation (CV), as assessed with the Friedman and Bonferroni–Dunn tests, and 87.82% and 78.4%, respectively, using 10 runs of tenfold CV, with the best Friedman score. On the basis of this performance, we also demonstrate how continuous Re-RX with J48graft overcomes the accuracy-interpretability dilemma. We believe that continuous Re-RX with J48graft can help overcome this dilemma, supporting the transparency of Big Data in financial settings and in industrial applications.
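The evaluation protocol named in the abstract (10 runs of tenfold CV, followed by a rank-based comparison) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `majority_class` stand-in classifier, and the toy data in the usage note are all assumptions made for illustration, and the last function only computes the average ranks on which a Friedman test is built, not the test statistic itself.

```python
import random
from statistics import mean


def kfold_indices(n_samples, k, seed):
    """Shuffle indices 0..n_samples-1 and split them into k near-equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv_accuracy(X, y, fit_predict, k=10, runs=10):
    """Mean test accuracy over `runs` independent repetitions of k-fold CV."""
    fold_accuracies = []
    for run in range(runs):
        folds = kfold_indices(len(y), k, seed=run)  # fresh shuffle for each run
        for test_idx in folds:
            held_out = set(test_idx)
            train_idx = [i for i in range(len(y)) if i not in held_out]
            preds = fit_predict(
                [X[i] for i in train_idx],
                [y[i] for i in train_idx],
                [X[i] for i in test_idx],
            )
            correct = sum(p == y[i] for p, i in zip(preds, test_idx))
            fold_accuracies.append(correct / len(test_idx))
    return mean(fold_accuracies)


def majority_class(X_train, y_train, X_test):
    """Hypothetical stand-in classifier: always predicts the most common training label."""
    most_common = max(set(y_train), key=y_train.count)
    return [most_common] * len(X_test)


def friedman_average_ranks(scores):
    """scores[d][m] is the accuracy of method m on dataset d.

    Returns the average rank of each method (rank 1 = best);
    tied methods share the mean of the ranks they occupy.
    """
    n_methods = len(scores[0])
    totals = [0.0] * n_methods
    for row in scores:
        for m, s in enumerate(row):
            better = sum(1 for t in row if t > s)
            tied = sum(1 for t in row if t == s)  # includes the method itself
            totals[m] += better + (tied + 1) / 2
    return [t / len(scores) for t in totals]
```

As a usage example on made-up data, `repeated_cv_accuracy([[i] for i in range(100)], [0] * 70 + [1] * 30, majority_class)` evaluates the stand-in classifier over 100 folds in total, and `friedman_average_ranks([[0.9, 0.8, 0.7], [0.85, 0.85, 0.6]])` ranks three hypothetical methods on two hypothetical datasets. A real study would replace `majority_class` with the rule extraction method under test.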

Funding

This work was supported in part by the Japan Society for the Promotion of Science through a Grant-in-Aid for Scientific Research (C) (18K11481).

Author information

Corresponding author

Correspondence to Yoichi Hayashi.

Ethics declarations

Ethical approval

All authors and the responsible authorities where the work was conducted have approved the final manuscript and agree with submission to New Generation Computing.

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Hayashi, Y., Oishi, T. High Accuracy-priority Rule Extraction for Reconciling Accuracy and Interpretability in Credit Scoring. New Gener. Comput. 36, 393–418 (2018). https://doi.org/10.1007/s00354-018-0043-5

  • DOI: https://doi.org/10.1007/s00354-018-0043-5
