An efficient fraud detection framework with credit card imbalanced data in financial services

Multimedia Tools and Applications

Abstract

Credit card fraud disrupts the economic order of markets and erodes the trust and interests of stakeholders, financial entities, and consumers. Card fraud losses increase every year, and billions of dollars are lost. This work therefore provides a framework for detecting card fraud efficiently. Card transaction datasets are imbalanced because ordinary transactions far outnumber fraudulent ones. Before tackling fraud detection itself, the imbalanced data problem must be solved; it occurs when the examples of one class considerably outnumber those of the other. Classifying fraud then becomes very difficult, because the result may be biased towards the majority class. This paper therefore aims first to apply hybrid sampling and oversampling preprocessing techniques to solve the imbalanced data problem, and second to detect the fraud. The performance of the proposed framework is evaluated using accuracy, precision, and recall, in comparison with existing algorithms such as KNN, LR, LDA, NB, and CART. The results show that when the data is highly imbalanced, the model struggles to detect fraudulent transactions. In addition, the prediction of the positive class improved significantly, reaching an accuracy of 99.9%.
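
The bias towards the majority class can be illustrated with a minimal sketch. It is not the framework proposed in the paper: the toy data (999 ordinary transactions and 1 fraud), the helper names `metrics` and `random_oversample`, and the use of plain random oversampling are all assumptions made for illustration. The sketch shows why accuracy alone can mislead on imbalanced data, which is one reason to report precision and recall as well.

```python
import random
from collections import Counter


def metrics(y_true, y_pred, positive=1):
    """Return (accuracy, precision, recall) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when nothing is predicted positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen minority rows until all classes are equal in size.

    Hypothetical helper: plain random oversampling, not the paper's technique.
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


# Toy data (assumed): 999 ordinary transactions (0) and 1 fraud (1).
y_true = [0] * 999 + [1]
X = [[float(i)] for i in range(1000)]

# A classifier that always predicts "ordinary" never finds the fraud.
always_ordinary = [0] * len(y_true)
acc, prec, rec = metrics(y_true, always_ordinary)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")

# After random oversampling both classes have the same number of rows.
X_bal, y_bal = random_oversample(X, y_true)
print(Counter(y_bal))
```

On this toy data the trivial classifier scores high accuracy while its recall on the fraud class is zero. Oversampling should be applied to the training split only, so that duplicated rows do not leak into the evaluation data.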

Author information

Corresponding author

Correspondence to Aya Abd El-Naby.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

About this article

Cite this article

Abd El-Naby, A., Hemdan, E.ED. & El-Sayed, A. An efficient fraud detection framework with credit card imbalanced data in financial services. Multimed Tools Appl 82, 4139–4160 (2023). https://doi.org/10.1007/s11042-022-13434-6

  • DOI: https://doi.org/10.1007/s11042-022-13434-6
