Two-stage adaptive integration of multi-source heterogeneous data based on an improved random subspace and prediction of default risk of microcredit

Huang, Anzhong; Wu, Fei

doi:10.1007/s00521-020-05489-z

S.I. : SPIoT 2020
Published: 11 November 2020

Volume 33, pages 4065–4075, (2021)

Neural Computing and Applications

Anzhong Huang¹ &
Fei Wu²


Abstract

Prior studies have shown that machine learning methods based on single-source data can successfully monitor the risks of formal financial activities, but not those of informal financial activities. The reason is that the data generated by formal financial activities, whether structured or unstructured, are high in both quality and quantity, whereas the data generated by informal financial activities are not. Multi-source data are therefore the key to monitoring the risks of informal financial activities through machine learning. Although a few studies have attempted to use multi-source data for financial risk prediction, they simply stack the collected data and ignore its original sources, heterogeneity, mutual redundancy and other characteristics, so the improvement in prediction performance is not obvious. This paper therefore constructs the TSAIB_RS method, based on the two-stage adaptive integration of multi-source heterogeneous data, in which data from different sources and with different distributions are adaptively integrated. To test the reliability of the TSAIB_RS method, the paper takes the default risk of microcredit in China as the test target and compares the prediction results of various methods. It concludes that the TSAIB_RS method can significantly improve prediction performance.
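The contrast the abstract draws between simply stacking multi-source data and keeping track of where each feature came from can be illustrated with a small sketch. This is not the TSAIB_RS algorithm, whose details are not given in the abstract. It only shows, under assumed names, how a classic random subspace draw over a pooled feature set differs from a draw that samples each source separately. The source names, feature names, sampling rates, and both function names are hypothetical.

```python
import random


def stacked_subspace(sources, k, rng):
    """Classic random subspace: pool every feature and ignore its source."""
    pool = [(name, feat) for name, feats in sources.items() for feat in feats]
    return rng.sample(pool, k)


def source_aware_subspace(sources, rates, rng):
    """Draw a separate random subset of features from each source.

    rates[name] is the fraction of that source's features to keep. At least
    one feature is kept per source, so no source is dropped entirely.
    """
    subspace = []
    for name, feats in sources.items():
        n_keep = min(len(feats), max(1, round(rates[name] * len(feats))))
        subspace.extend((name, feat) for feat in rng.sample(feats, n_keep))
    return subspace


# Hypothetical sources and features, invented purely for illustration.
sources = {
    "source_a": ["a1", "a2", "a3", "a4", "a5", "a6"],
    "source_b": ["b1", "b2"],
    "source_c": ["c1", "c2"],
}
# Hypothetical per-source sampling rates (fixed here, not learned).
rates = {"source_a": 0.5, "source_b": 0.5, "source_c": 0.5}

rng = random.Random(0)
aware = source_aware_subspace(sources, rates, rng)
stacked = stacked_subspace(sources, len(aware), rng)

# The source-aware draw always covers every source; the pooled draw may not.
print(sorted({name for name, _ in aware}))
print(sorted({name for name, _ in stacked}))
```

In this sketch the per-source rates are constants. An adaptive scheme would presumably adjust them from the data, but how the paper does so cannot be inferred from the abstract.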



Acknowledgements

This paper is one of the mid-term results of the humanities and social science planning project funded by the Ministry of Education of the PRC, “Researches of the Formation Mechanism of Low Efficiency of Poverty Alleviation of Microcredit and Innovation Practice Model in Jiangsu Province” (20YJA790028); of a major project of philosophy and social science research in Jiangsu universities, “Researches on the Optimization of Fintech Innovation Supervision Path” (2019SJZDA060); and of the Anhui Province Social Science Association Project (2019CX079).

Author information

Authors and Affiliations

School of Economics and Management, Jiangsu University of Science and Technology, Zhenjiang, 212003, China
Anzhong Huang
School of Law, Shanghai University of Finance and Economics, Shanghai, 200433, China
Fei Wu

Corresponding author

Correspondence to Fei Wu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Huang, A., Wu, F. Two-stage adaptive integration of multi-source heterogeneous data based on an improved random subspace and prediction of default risk of microcredit. Neural Comput & Applic 33, 4065–4075 (2021). https://doi.org/10.1007/s00521-020-05489-z

Received: 16 August 2020
Accepted: 27 October 2020
Published: 11 November 2020
Issue Date: May 2021
DOI: https://doi.org/10.1007/s00521-020-05489-z
