Customer credit scoring based on HMM/GMDH hybrid model

  • Regular Paper
Knowledge and Information Systems

Abstract

The hidden Markov model (HMM) has been applied successfully in many fields, such as speech recognition and engineering. However, because it assumes that observations are conditionally independent given the state, the HMM has very limited capacity for recognizing complex patterns involving more than first-order dependencies in customer relationship management. The Group Method of Data Handling (GMDH) can overcome this drawback, so we propose a hybrid model that combines HMM and GMDH to score customer credit. The model has three phases: training the HMM with multiple observations, adding GMDH into the HMM, and optimizing the hybrid model. The proposed hybrid model is compared with other existing methods in terms of average accuracy, Type I error, Type II error and AUC. Experimental results show that the proposed method performs better than HMM/ANN on two credit scoring datasets. Implementing the HMM/GMDH hybrid model allows lenders and regulators to develop techniques to measure customer credit risk.
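
The four evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' evaluation code. The function name `credit_metrics`, the label coding (1 = good customer, 0 = bad customer), the 0.5 threshold and the toy data are all assumptions made for illustration. The sketch also assumes that a Type I error is a good customer classified as bad and a Type II error is a bad customer classified as good; some credit scoring papers use the opposite convention. Accuracy is computed on a single sample here, whereas the abstract's "average accuracy" presumably averages over repeated runs or folds.

```python
from typing import Dict, Sequence


def credit_metrics(
    y_true: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,
) -> Dict[str, float]:
    """Accuracy, Type I/II error and AUC for a binary credit scorer.

    Assumed conventions (not taken from the paper):
      y_true: 1 = good customer, 0 = bad customer
      scores: higher score = more likely to be a good customer
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")

    goods = [s for y, s in zip(y_true, scores) if y == 1]
    bads = [s for y, s in zip(y_true, scores) if y == 0]
    if not goods or not bads:
        raise ValueError("both classes must be present")

    # Type I error (assumed): share of good customers predicted as bad.
    type1 = sum(s < threshold for s in goods) / len(goods)
    # Type II error (assumed): share of bad customers predicted as good.
    type2 = sum(s >= threshold for s in bads) / len(bads)

    correct = sum(s >= threshold for s in goods) + sum(s < threshold for s in bads)
    accuracy = correct / len(y_true)

    # AUC as the probability that a random good customer outscores a
    # random bad customer, counting ties as one half.
    wins = sum(
        1.0 if g > b else 0.5 if g == b else 0.0
        for g in goods
        for b in bads
    )
    auc = wins / (len(goods) * len(bads))

    return {"accuracy": accuracy, "type1": type1, "type2": type2, "auc": auc}


# Toy data, invented for illustration only.
result = credit_metrics([1, 1, 1, 0, 0], [0.9, 0.6, 0.4, 0.7, 0.2])
```

The pairwise AUC loop is quadratic in the sample size, which is fine for a sketch but would normally be replaced by a rank-based computation on larger datasets.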

Acknowledgments

This research is supported by the Natural Science Foundation of China under Grant Nos. 71071101, 71101100 and 71211130018; the New Teachers Fund for Doctor Stations, Ministry of Education, under Grant No. 20110181120047; the China Postdoctoral Science Foundation under Grant No. 2011M500418; and the Research Start-up Project of Sichuan University under Grant No. 2010SCU11012.

Author information

Corresponding author

Correspondence to Chang-Zheng He.

About this article

Cite this article

Teng, GE., He, CZ., Xiao, J. et al. Customer credit scoring based on HMM/GMDH hybrid model. Knowl Inf Syst 36, 731–747 (2013). https://doi.org/10.1007/s10115-012-0572-z

  • DOI: https://doi.org/10.1007/s10115-012-0572-z
