A Data Driven Ensemble Classifier for Credit Scoring Analysis

  • Conference paper
Advances in Knowledge Discovery and Data Mining (PAKDD 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5476)

Abstract

This study focuses on predicting whether a credit applicant can be categorized as good, bad, or borderline from the information initially supplied. Given the importance of this task, many researchers have recently worked on ensembles of classifiers. However, unrepresentative samples drastically reduce the accuracy of the deployed classifier, and, to the best of our knowledge, few have attempted to preprocess the input samples into more homogeneous cluster groups and then fit the ensemble classifier accordingly. For this reason, we introduce the concept of class-wise classification as a preprocessing step in order to obtain an efficient ensemble classifier. We expect this strategy to work better than a direct ensemble of classifiers without the preprocessing step. The proposed ensemble classifier is constructed by incorporating several data mining techniques, mainly optimal associate binning to discretize continuous values; a neural network, a support vector machine, and a Bayesian network are used to augment the ensemble classifier. In particular, the Markov blanket concept of Bayesian networks allows for a natural form of feature selection, which provides a basis for mining association rules.
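
The general idea of cleaning samples class by class before fitting an ensemble can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the details, so everything here is an assumption made for illustration. The helper names (`kmeans`, `classwise_filter`, `nearest_centroid`, `knn`, `majority_vote`), the rule of dropping clusters smaller than `min_size`, the toy two-class data, and the simple base classifiers (stand-ins for the neural network, support vector machine, and Bayesian network named above) are all hypothetical.

```python
from collections import Counter
from math import dist

def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [points[i] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return assign

def classwise_filter(X, y, k=2, min_size=2):
    """Cluster each class separately and drop samples in clusters below min_size.

    Assumed filtering rule, used here only to mimic removal of
    unrepresentative samples.
    """
    keep_X, keep_y = [], []
    for label in sorted(set(y)):
        pts = [x for x, t in zip(X, y) if t == label]
        assign = kmeans(pts, min(k, len(pts)))
        sizes = Counter(assign)
        for p, a in zip(pts, assign):
            if sizes[a] >= min_size:
                keep_X.append(p)
                keep_y.append(label)
    return keep_X, keep_y

def nearest_centroid(X, y):
    """Base classifier 1: predict the class whose mean point is closest."""
    cents = {}
    for label in set(y):
        pts = [x for x, t in zip(X, y) if t == label]
        cents[label] = tuple(sum(v) / len(pts) for v in zip(*pts))
    return lambda q: min(cents, key=lambda lab: dist(q, cents[lab]))

def knn(X, y, k):
    """Base classifiers 2 and 3: k-nearest-neighbour vote."""
    def predict(q):
        nearest = sorted(zip(X, y), key=lambda s: dist(q, s[0]))[:k]
        return Counter(t for _, t in nearest).most_common(1)[0][0]
    return predict

def majority_vote(models, q):
    """Combine base classifiers by simple majority voting."""
    return Counter(m(q) for m in models).most_common(1)[0][0]

# Toy data: the last "good" sample sits inside the "bad" region.
X = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (5.1, 5.0),
     (5.0, 5.0), (6.0, 6.0), (5.1, 4.9), (6.1, 5.9)]
y = ["good"] * 5 + ["bad"] * 4

X_clean, y_clean = classwise_filter(X, y)
models = [nearest_centroid(X_clean, y_clean),
          knn(X_clean, y_clean, 1),
          knn(X_clean, y_clean, 3)]

print(len(X), len(X_clean))                  # sample count before and after filtering
print(majority_vote(models, (5.1, 5.0)))     # ensemble prediction on cleaned data
```

On this toy data, a 1-nearest-neighbour classifier fitted on the raw samples labels the query `(5.1, 5.0)` as "good" because of the isolated sample, while the same classifier fitted after the class-wise filter does not. The abstract's setting has three classes (good, bad, borderline); the sketch uses two only to stay short.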

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hsieh, NC., Hung, LP., Ho, CL. (2009). A Data Driven Ensemble Classifier for Credit Scoring Analysis. In: Theeramunkong, T., Kijsirikul, B., Cercone, N., Ho, TB. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2009. Lecture Notes in Computer Science, vol 5476. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-01307-2_33

  • DOI: https://doi.org/10.1007/978-3-642-01307-2_33

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-01306-5

  • Online ISBN: 978-3-642-01307-2

  • eBook Packages: Computer Science, Computer Science (R0)
