A Novel System Architecture for Anomaly Detection for Loan Defaults

  • Conference paper
Distributed Computing and Artificial Intelligence, 20th International Conference (DCAI 2023)

Abstract

Given the rise in loan defaults, especially since the onset of the COVID-19 pandemic, risk management requires predicting whether a customer is likely to default on a loan. This paper proposes an early warning system architecture that uses anomaly detection, motivated by the unbalanced nature of real-world loan default data: most customers do not default on their loans, and only a tiny percentage do. We evaluate candidate anomaly detection methods for their suitability in handling such unbalanced datasets through a comparative study of classification and anomaly detection approaches on one balanced and one unbalanced dataset. The classification algorithms compared are logistic regression and stochastic gradient descent (SGD) classification; the anomaly detection methods are isolation forest and angle-based outlier detection (ABOD). We compare them using standard evaluation metrics: accuracy, precision, recall, F1 score, area under the receiver operating characteristic (ROC) curve, and training and prediction time. The results show that the anomaly detection methods, particularly isolation forest, perform significantly better on unbalanced loan default data and are more suitable for real-world applications.
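
To illustrate why the abstract reports precision, recall, and F1 score alongside accuracy, the following minimal sketch computes those metrics on a small unbalanced label set. The label counts, the `evaluate` helper, and the two sets of predictions are hypothetical and chosen only for illustration; they are not the paper's data, models, or results.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def evaluate(y_true, y_pred):
    """Compute metrics with label 1 (default) as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is flagged or nothing defaults.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return Metrics(accuracy, precision, recall, f1)


# Hypothetical unbalanced sample: 980 repaid loans (0) and 20 defaults (1).
y_true = [0] * 980 + [1] * 20

# Baseline that never flags a default.
always_repaid = [0] * 1000

# Hypothetical detector: 30 false alarms, and 15 of the 20 defaults caught.
detector = [1] * 30 + [0] * 950 + [1] * 15 + [0] * 5

baseline_metrics = evaluate(y_true, always_repaid)
detector_metrics = evaluate(y_true, detector)

print(baseline_metrics)
print(detector_metrics)
```

In this made-up sample, the baseline that never predicts a default has the higher accuracy of the two yet a recall of zero, while the detector trades a little accuracy for catching most defaults. This is the kind of gap that accuracy alone hides on unbalanced data.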

Author information

Corresponding author

Correspondence to Rayhaan Pirani.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Pirani, R., Kobti, Z. (2023). A Novel System Architecture for Anomaly Detection for Loan Defaults. In: Ossowski, S., Sitek, P., Analide, C., Marreiros, G., Chamoso, P., Rodríguez, S. (eds) Distributed Computing and Artificial Intelligence, 20th International Conference. DCAI 2023. Lecture Notes in Networks and Systems, vol 740. Springer, Cham. https://doi.org/10.1007/978-3-031-38333-5_14
