Abstract
In the emerging peer-to-peer (P2P) lending industry, risks such as credit risk and default risk can cause heavy losses for online lending platforms and investors. It is therefore necessary to design a reasonable credit risk evaluation index system that assesses the risk level of borrowers scientifically. This paper studies the design of a comprehensive evaluation index system for the P2P credit risk of “three rural” (i.e., agriculture, rural areas and farmers) borrowers. Concretely, we first construct the feature set for the P2P credit risk of “three rural” borrowers. Starting from the traditional index system, we add static indexes specific to agriculture-related borrowers and dynamic indexes that reflect the Internet context as the preliminary indexes of the feature set, and we select borrower data from the “Pterosaur loan” platform as the research sample. From these data, 35 borrower credit features are extracted as the credit risk feature set. We then present a two-stage feature selection method, based on filter and wrapper, to select the main features from the 35 initial borrower credit features. In the filter stage, three filter methods are used to calculate the importance of the unbalanced features. In the wrapper stage, a Lasso-logistic method is proposed to filter the feature subset through a heuristic search algorithm. Finally, 21 main independent features are selected according to classification accuracy; these constitute the credit risk evaluation index system for “three rural” borrowers.
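The two-stage idea described above (a filter ranking followed by a Lasso-penalized logistic model) can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the single filter score (absolute Pearson correlation with the label, where the paper uses three filter methods that the abstract does not name), the fixed penalty `lam` in place of the paper's heuristic search, the proximal gradient solver, the function names, and the synthetic data, which stands in for the platform sample.

```python
import math
import random


def abs_pearson(xs, ys):
    """Absolute Pearson correlation between one feature column and the labels."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (t - my) for x, t in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((t - my) ** 2 for t in ys)
    return abs(cov) / math.sqrt(vx * vy) if vx > 0 and vy > 0 else 0.0


def filter_stage(X, y, keep):
    """Filter stage (assumed score): keep the `keep` highest-scoring features."""
    d = len(X[0])
    scores = [abs_pearson([row[j] for row in X], y) for j in range(d)]
    ranked = sorted(range(d), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])


def soft_threshold(value, t):
    """Proximal operator of the L1 penalty: shrink `value` towards zero by t."""
    return math.copysign(max(abs(value) - t, 0.0), value)


def lasso_logistic(X, y, lam, lr=0.5, iters=300):
    """L1-penalized logistic regression fitted by proximal gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for row, label in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, row))
            z = max(-30.0, min(30.0, z))  # guard against overflow in exp
            err = 1.0 / (1.0 + math.exp(-z)) - label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * row[j]
        b -= lr * grad_b / n  # the intercept is not penalized
        w = [soft_threshold(wj - lr * g / n, lr * lam)
             for wj, g in zip(w, grad_w)]
    return w, b


def two_stage_select(X, y, keep, lam):
    """Run the filter, then keep the features whose Lasso weight is nonzero."""
    kept = filter_stage(X, y, keep)
    X_sub = [[row[j] for j in kept] for row in X]
    w, _ = lasso_logistic(X_sub, y, lam)
    return [j for j, wj in zip(kept, w) if wj != 0.0]


# Synthetic stand-in data: 6 features, only features 0 and 1 drive the label.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(200)]
y = [1 if 2.0 * r[0] - 1.5 * r[1] + 0.3 * rng.gauss(0, 1) > 0 else 0 for r in X]

kept = filter_stage(X, y, keep=4)
selected = two_stage_select(X, y, keep=4, lam=0.05)
print("after filter:", kept)
print("after Lasso-logistic:", selected)
```

In this sketch the L1 penalty does the second-stage selection by driving uninformative weights to exactly zero. A fuller wrapper would instead search over candidate subsets or penalty values and compare classification accuracy on held-out data, which is closer to the selection criterion the abstract describes.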
Acknowledgements
This work is supported by the National Natural Science Foundation of China (No. 71671135), the 2018 Soft Science Research Project of Technology Innovation in Hubei province (No. 2018ADC044) and the 2019 Fundamental Research Funds for the Central Universities (WUT: 2019IB013).
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval
This article does not contain any studies with human participants performed by any of the authors.
Informed consent
Informed consent was obtained from all individual participants included in the study.
Additional information
Communicated by V. Loia.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Rao, C., Lin, H. & Liu, M. Design of comprehensive evaluation index system for P2P credit risk of “three rural” borrowers. Soft Comput 24, 11493–11509 (2020). https://doi.org/10.1007/s00500-019-04613-z
DOI: https://doi.org/10.1007/s00500-019-04613-z