Semi-supervised adapted HMMs for P2P credit scoring systems with reject inference

El Annas, Monir; Benyacoub, Badreddine; Ouzineb, Mohamed

doi:10.1007/s00180-022-01220-9

Original paper
Published: 14 May 2022

Volume 38, pages 149–169, (2023)

Computational Statistics


Abstract

Most current credit-scoring models used in loan approval are built only on information from accepted applicants, whose ability to repay the loan is known. This produces what is called selection bias: the training sample is not representative of the applicant population, because rejected applications are excluded. The bias affects the suitability of these models from both a statistical and an economic point of view, especially on peer-to-peer lending platforms, where the rejection rate is extremely high. Reject inference is the practice of inferring information about rejected applicants when constructing a credit scoring model. This study proposes a semi-supervised learning framework based on hidden Markov models (SSHMM) as a novel method of reject inference. Real data from the Lending Club platform, the most used online lending marketplace in the United States as well as the rest of the world, is used to test the effectiveness of the method against existing approaches. The results illustrate the proposed method's superiority, stability, and adaptability.
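To make the idea of reject inference concrete, the following is a minimal sketch of semi-supervised learning with accepted (labeled) and rejected (unlabeled) applicants. It is not the SSHMM proposed in the paper: it swaps in a much simpler technique, a two-class Gaussian mixture with a shared variance fitted by expectation-maximization on a single risk score. The function names (`simulate_applicants`, `fit_with_rejects`, `posterior_default`), the acceptance cutoff, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def posterior_default(x, priors, means, var):
    """P(default | x) under the mixture; falls back to 0.5 if both densities underflow."""
    w0 = priors[0] * normal_pdf(x, means[0], var)
    w1 = priors[1] * normal_pdf(x, means[1], var)
    total = w0 + w1
    return w1 / total if total > 0.0 else 0.5


def _m_step(xs, w1):
    """Weighted maximum-likelihood update; w1[i] is the class-1 weight of xs[i]."""
    n = len(xs)
    s1 = sum(w1)
    s0 = n - s1
    m1 = sum(w * x for w, x in zip(w1, xs)) / s1
    m0 = sum((1.0 - w) * x for w, x in zip(w1, xs)) / s0
    var = sum(w * (x - m1) ** 2 + (1.0 - w) * (x - m0) ** 2
              for w, x in zip(w1, xs)) / n
    return [s0 / n, s1 / n], [m0, m1], max(var, 1e-9)


def fit_with_rejects(labeled, unlabeled, n_iter=50):
    """Semi-supervised EM.

    labeled:   (score, label) pairs for accepted applicants, label 1 = default.
               Assumes both labels occur at least once.
    unlabeled: scores of rejected applicants, whose outcome is unknown.
    Returns (priors, means, shared_variance).
    """
    lab_x = [x for x, _ in labeled]
    lab_w = [float(y) for _, y in labeled]
    # Supervised start: estimate parameters from accepted applicants only.
    priors, means, var = _m_step(lab_x, lab_w)
    xs = lab_x + list(unlabeled)
    for _ in range(n_iter):
        # E-step: soft default labels for rejected applicants.
        resp = [posterior_default(x, priors, means, var) for x in unlabeled]
        # M-step: re-estimate using known labels plus the soft labels.
        priors, means, var = _m_step(xs, lab_w + resp)
    return priors, means, var


def simulate_applicants(n=2000, cutoff=1.5, seed=0):
    """Hypothetical data: applicants with score >= cutoff are rejected."""
    rng = random.Random(seed)
    labeled, unlabeled = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.3 else 0
        x = rng.gauss(2.0 if y == 1 else 0.0, 1.0)
        if x < cutoff:
            labeled.append((x, y))
        else:
            unlabeled.append(x)
    return labeled, unlabeled


if __name__ == "__main__":
    accepted, rejected = simulate_applicants()
    priors, means, var = fit_with_rejects(accepted, rejected)
    print("estimated default rate:", round(priors[1], 3))
    print("class means:", [round(m, 3) for m in means])
```

In this sketch the rejected applicants contribute to the parameter estimates only through their soft labels, so the fitted default rate reflects the whole applicant pool rather than the accepted sample alone. How well such a correction works in practice depends on the model assumptions, since rejection is not random.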



Author information

Authors and Affiliations

Institut National de Statistique et d’Economie Appliquée, Rabat, Morocco
Monir El Annas, Badreddine Benyacoub & Mohamed Ouzineb

Corresponding author

Correspondence to Monir El Annas.


About this article

Cite this article

El Annas, M., Benyacoub, B. & Ouzineb, M. Semi-supervised adapted HMMs for P2P credit scoring systems with reject inference. Comput Stat 38, 149–169 (2023). https://doi.org/10.1007/s00180-022-01220-9


Received: 13 May 2021
Accepted: 20 March 2022
Published: 14 May 2022
Issue Date: March 2023
DOI: https://doi.org/10.1007/s00180-022-01220-9
