Default avoidance on credit card portfolios using accounting, demographical and exploratory factors: decision making based on machine learning (ML) techniques

Sariannidis, Nikolaos; Papadakis, Stelios; Garefalakis, Alexandros; Lemonakis, Christos; Kyriaki-Argyro, Tsioptsia

doi:10.1007/s10479-019-03188-0

S.I.: BALCOR-2017
Published: 15 March 2019

Volume 294, pages 715–739, (2020)
Annals of Operations Research

Nikolaos Sariannidis¹, Stelios Papadakis², Alexandros Garefalakis², Christos Lemonakis² & Tsioptsia Kyriaki-Argyro³

1510 Accesses
20 Citations

Abstract

Effective and thorough credit-risk management is a key factor for lending institutions, as significant financial losses can arise from borrowers’ default. Machine learning methods, which are attracting increasing attention, can measure and analyze credit risk objectively. This study analyzes default payment data from a credit card portfolio of some 30,000 clients in Taiwan, described by twenty-three attributes and with no missing values. We compare the prediction accuracy of seven classification methods: KNN, Logistic Regression, Naïve Bayes, Decision Trees, Random Forest, SVC, and Linear SVC. The results indicate that only a few of the typical variables used are needed to adequately analyze default characteristics for lending decisions. The results provide effective feedback to credit evaluators, lending institutions and business analysts for in-depth analysis. They also point to the importance of precautionary borrowing techniques for better understanding credit-card borrowers’ behavior, along with specific accounting, historical and demographical characteristics.
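The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: it assumes scikit-learn, default hyperparameters, feature standardization for the distance- and margin-based models, and 5-fold cross-validated accuracy, none of which is stated in the abstract. The function names `build_classifiers` and `compare_accuracy` are hypothetical, and the demonstration runs on synthetic data with 23 features rather than on the Taiwan portfolio.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC, LinearSVC
from sklearn.tree import DecisionTreeClassifier


def build_classifiers(seed=0):
    """Return the seven model families named in the abstract.

    Hyperparameters are library defaults (an assumption); scaling is
    added only where the model is sensitive to feature magnitudes.
    """
    return {
        "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
        "Naive Bayes": GaussianNB(),
        "Decision Tree": DecisionTreeClassifier(random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=seed),
        "SVC": make_pipeline(StandardScaler(), SVC()),
        "Linear SVC": make_pipeline(StandardScaler(), LinearSVC(max_iter=5000)),
    }


def compare_accuracy(X, y, folds=5, seed=0):
    """Mean cross-validated accuracy per classifier (fold count is an assumption)."""
    scores = {}
    for name, model in build_classifiers(seed).items():
        fold_scores = cross_val_score(model, X, y, cv=folds, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


if __name__ == "__main__":
    # Synthetic stand-in: 23 features, a minority "default" class.
    X, y = make_classification(
        n_samples=600, n_features=23, n_informative=5,
        weights=[0.8, 0.2], random_state=0,
    )
    ranked = sorted(compare_accuracy(X, y).items(), key=lambda kv: -kv[1])
    for name, acc in ranked:
        print(f"{name:20s} {acc:.3f}")
```

Accuracy alone can be misleading when defaulters are a minority class, so a fuller comparison would likely report additional metrics as well.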


Data availability

The data set is based on the publicly available credit card default data set from the UCI Machine Learning Repository. Details are here: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients.
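Once the file has been downloaded, separating the attributes from the default flag might look like the sketch below. The column names `ID` and `default payment next month` are assumptions about how the repository file is labelled and should be checked against the downloaded copy; `split_features_target` is a hypothetical helper, shown here on a tiny hand-made frame rather than the real data.

```python
import pandas as pd

# Assumed column labels; verify against the downloaded file.
TARGET_COL = "default payment next month"
ID_COL = "ID"


def split_features_target(df, target_col=TARGET_COL, id_col=ID_COL):
    """Split a client table into a feature matrix X and a binary target y.

    Raises if any value is missing, since the data set is described
    as having no missing information.
    """
    if df.isna().any().any():
        raise ValueError("unexpected missing values in the data")
    y = df[target_col].astype(int)
    # Drop the target and the row identifier; ignore the id if absent.
    X = df.drop(columns=[target_col, id_col], errors="ignore")
    return X, y


if __name__ == "__main__":
    # Tiny illustrative frame, not real client records.
    demo = pd.DataFrame({
        "ID": [1, 2, 3],
        "LIMIT_BAL": [20000, 120000, 90000],
        "AGE": [24, 26, 34],
        "default payment next month": [1, 1, 0],
    })
    X, y = split_features_target(demo)
    print(list(X.columns), y.tolist())
```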

References

Aha, D. (1992). Tolerating noisy, irrelevant, and novel attributes in instance-based learning algorithms. International Journal of Man–Machine Studies, 36(2), 267–287.
Ajay, V., & Shomona, G. J. (2016). Prediction of credit-card defaulters: a comparative study on performance of classifiers. International Journal of Computer Applications (0975–8887), 145(7), 36–41.
Ben-Hur, A., Horn, D., Siegelmann, H. T., & Vapnik, V. (2001). Support vector clustering. Journal of Machine Learning Research, 2, 125–137.
Bhaduri, A. (2009). Credit scoring using artificial immune system algorithms: a comparative study. In Proceedings of the world congress on nature and biologically inspired computing NaBIC2009, Coimbatore (pp. 1540–1543).
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
Cheng, D., Zhang, S., Deng, Z., Zhu, Y., & Zong, M. (2014). kNN algorithm with data-driven k value. In X. Luo, J. X. Yu, & Z. Li (Eds.), Advanced data mining and applications. ADMA 2014. Lecture Notes in Computer Science (Vol. 8933). Berlin: Springer.
Davis, R. H., Edelman, D. B., & Gammerman, A. J. (1992). Machine-learning algorithms for credit-card applications. Journal of Management Mathematics, 4(1), 43–51.
Dimitras, A., Papadakis, S., & Garefalakis, A. (2017). Evaluation of empirical attributes for credit risk forecasting from numerical data. Investment Management and Financial Innovations, 14(1), 9–18. https://doi.org/10.21511/imfi.14(1).2017.01.
Frank, E., & Witten, I. H. (1998). Generating accurate rule sets without global optimization. In J. Shavlik (Ed.), Proceedings of the fifteenth international conference on machine learning, Madison, WI. San Francisco: Morgan Kaufmann (pp. 144–151).
Frank, E., & Hall, M. (2001). A simple approach to ordinal classification. In L. de Raedt, & P. A. Flach (Eds.), Proceedings of the twelfth European conference on machine learning, Freiburg, Germany. Berlin: Springer (pp. 145–156).
Hamori, S., Kawai, M., Kume, T., Murakami, Y., & Watanabe, Y. (2018). Ensemble learning or deep learning? Application to default risk analysis. Journal of Risk and Financial Management, 11(1), 12. https://doi.org/10.3390/jrfm11010012.
Hand, D. J., & Henley, W. E. (1996). A k-nearest-neighbour classifier for assessing consumer credit risk. The Statistician, 45(1), 77–95.
He, J., Liu, X., Shi, Y., Xu, W., & Yan, N. (2004). Classifications of credit cardholder behavior by using fuzzy linear programming. International Journal of Information Technology and Decision Making, 3(4), 633–650.
Jenhani, I., Nahla, B. A., & Ziedm, E. (2008). Decision trees as possibilistic classifiers (Special Section on Choquet Integration in honor of Gustave Choquet (1915–2006) and Special Section on Nonmonotonic and Uncertain Reasoning). International Journal of Approximate Reasoning, 48(3), 784–807.
Khandani, A. E., Kim, A. J., & Lo, A. W. (2010). Consumer credit risk models via machine-learning algorithms. AFA 2011 Denver Meetings Paper. https://doi.org/10.2139/ssrn.1568864.
Krichene, A. (2017). Using a naive Bayesian classifier methodology for loan risk assessment evidence from a Tunisian commercial bank. Journal of Economics, Finance and Administrative Science, 22(42), 3–24.
Landwehr, N., Hall, M., & Frank, E. (2003). Logistic model trees. In N. Lavrac, D. Gamberger, L. Todorovski, & H. Blockeel (Eds.), Proceedings of the fourteenth European conference on machine learning, Cavtat-Dubrovnik, Croatia. Berlin: Springer (pp. 241–252).
Lee, T. S., Chiu, C. C., Chou, Y. C., & Lu, C. J. (2006). Mining the customer credit using classification and regression tree and multivariate adaptive regression splines. Computational Statistics and Data Analysis, 50, 1113–1130.
Lichman, M. (2013). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine: University of California, School of Information and Computer Science. The original dataset can be found at the UCI Machine Learning Repository, i.e. https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
Makalic, E., & Schmidt, D. F. (2010). Review of modern logistic regression methods with application to small and medium sample size problems. In J. Li (Ed.), AI 2010: Advances in artificial intelligence. AI 2010. Lecture Notes in Computer Science (Vol. 6464). Berlin: Springer.
Marinakis, Y., Marinaki, M., Doumpos, M., & Zopounidis, C. (2009). Ant colony and particle swarm optimization for financial classification problems. Expert Systems with Applications, 36, 10604–10611.
Neema, S., & Soibam, B. (2017). The comparison of machine learning methods to achieve most cost-effective prediction for credit card default. Journal of Management Science and Business Intelligence, 2(2), 36–41.
Peng, Y., Kou, G., Chen, Z., & Shi, Y. (2004). Cross-validation and ensemble analyses on multiple-criteria linear programming classification for credit cardholder behavior, Lecture Notes in Computer Science, ICCS 2004 (Vol. 3039, pp. 931–939).
Quinlan, J., Rajendra, G., & Castro, D. (1998). Bank collateralised loan obligations: From 0 to 60 in less than 2 years? Merrill Lynch, Global Securities Research & Economics Group, March.
Ramoni, M., & Sebastiani, P. (2001). Robust Bayes classifiers. Artificial Intelligence, 125(1–2), 209–226.
Shen, A., Tong, R., & Deng, Y. (2007). Application of classification models on credit card fraud detection. In Proceedings of the international conference on service systems and service management, Chengdu (pp. 1–4).
Shi, Y., Peng, Y., Kou, G., & Chen, Z. (2005). Classifying credit card accounts for business intelligence and decision making: A multiple-criteria quadratic programming approach. International Journal of Information Technology and Decision Making, 4(4), 581–599.
Shomona, J. G., & Ramani, R. G. (2011). Discovery of knowledge patterns in clinical data through data mining algorithms: Multi-class categorization of breast tissue data. International Journal of Computer Applications, 32(7), 46–53.
Srinivasan, V., & Kim, Y. H. (1987). Credit granting: A comparative analysis of classification procedures. The Journal of Finance, 42(3), 665–681.
Stone, M. (1974). Cross-validation choice and assessment of statistical predictions. Journal of the Royal Statistical Society B, 36, 111–147.
Watanabe, C. Y. V., Ribeiro, M. X., Traina, C., & Traina, A. J. M. (2011). SACMiner: A new classification method based on statistical association rules to mine medical images. In J. Filipe, & J. Cordeiro (Eds.), Enterprise information systems. ICEIS 2010. Lecture Notes in Business Information Processing (Vol. 73). Berlin: Springer.
Yeh, I.-C., & Lien, C. (2009). The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients. Expert Systems with Applications, 36(2, Part 1), 2473–2480.

Acknowledgements

The current publication is based on the following dataset: Lichman (2013). We would also like to thank the Laboratory of Artificial Intelligence Systems and Computer Architectures of the Technological Educational Institute of Crete for providing the computing power needed to carry out the extensive experiments for this work.

Author information

Authors and Affiliations

Department of Finance and Accounting, Western Macedonia University of Applied Sciences, Kozani, Greece
Nikolaos Sariannidis
Department of Business Administration, Technological Educational Institute of Crete, Agios Nikolaos Branch, Heraklion, Crete, Greece
Stelios Papadakis, Alexandros Garefalakis & Christos Lemonakis
Department of Accounting and Finance, Western Macedonia University of Applied Sciences, Kozani, Greece
Tsioptsia Kyriaki-Argyro

Corresponding author

Correspondence to Nikolaos Sariannidis.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sariannidis, N., Papadakis, S., Garefalakis, A. et al. Default avoidance on credit card portfolios using accounting, demographical and exploratory factors: decision making based on machine learning (ML) techniques. Ann Oper Res 294, 715–739 (2020). https://doi.org/10.1007/s10479-019-03188-0

Published: 15 March 2019
Issue Date: November 2020
DOI: https://doi.org/10.1007/s10479-019-03188-0
