FinGAN: Chaotic generative adversarial network for analytical customer relationship management in banking and insurance

  • Original Article
Neural Computing and Applications

Abstract

Credit card churn prediction, insurance fraud detection, and loan default prediction are all critical analytical customer relationship management (ACRM) problems. Since these events occur infrequently, datasets for these problems are highly unbalanced. Consequently, when trained on such unbalanced datasets, all machine learning classifiers tend to produce high false positive rates. We propose two methods for data balancing. In the first approach, to oversample the minority class, we propose an innovative GAN called chaoticGAN, which uses chaotic noise as input to the generator. We also employ the traditional GAN (Goodfellow et al. in Adv Neural Inf Process Syst, 2014. https://doi.org/10.1145/3422622), Wasserstein GAN (Arjovsky et al. in Wasserstein GAN, 2017. https://arxiv.org/abs/1701.07875), and CTGAN (Xu et al. in Modeling Tabular Data using Conditional GAN. https://arxiv.org/pdf/1907.00503) independently for baseline comparison. On the data balanced by the GANs, we employ a host of machine learning classifiers, including Random Forest, Decision Tree, Support Vector Machine (SVM), Logistic Regression (LR), Multi-Layer Perceptron (MLP), and Light Gradient Boosting Machine (LGBM), to demonstrate the efficacy of our approaches. In the second approach, we augment the oversampled synthetic minority class data obtained by the GAN and its variants with the undersampled majority class data obtained by a one-class support vector machine (OCSVM) (Tax et al. in Mach Learn 54:45–66, 2004). We then pass the entire modified dataset to the classifiers. Our proposed approaches outperform earlier studies on the same datasets in terms of the area under the ROC curve (AUC). Further, our proposed chaoticGAN and its hybrid turned out to be statistically similar to the state-of-the-art CTGAN on all datasets, while being statistically significantly better than the other methods with respect to AUC over tenfold cross-validation.
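
As a rough illustration of the chaotic-noise idea described in the abstract, the sketch below draws generator input from a logistic map instead of a Gaussian or uniform sampler. The abstract does not say which chaotic system chaoticGAN uses, so the logistic map, the function name logistic_map_noise, and the parameters r, burn_in, and seed are all assumptions made for this example and should not be read as the authors' implementation.

```python
import random


def logistic_map_noise(n_samples, noise_dim, r=4.0, burn_in=100, seed=0):
    """Return n_samples noise vectors of length noise_dim from a logistic map.

    Hypothetical helper: iterates x_{t+1} = r * x_t * (1 - x_t).
    r = 4.0 is an assumed setting that lies in the map's chaotic regime.
    """
    rng = random.Random(seed)
    # Seeded random starting point between 0.1 and 0.9.
    x = rng.uniform(0.1, 0.9)

    # Discard the first iterates so the samples do not depend
    # visibly on the starting point.
    for _ in range(burn_in):
        x = r * x * (1.0 - x)

    batch = []
    for _ in range(n_samples):
        row = []
        for _ in range(noise_dim):
            x = r * x * (1.0 - x)
            # Rescale from [0, 1] to [-1, 1] (an assumed convention).
            row.append(2.0 * x - 1.0)
        batch.append(row)
    return batch


if __name__ == "__main__":
    # Four noise vectors of dimension three, e.g. one small generator batch.
    for vector in logistic_map_noise(4, 3):
        print(vector)
```

Each row could then be passed to a generator network in place of the usual random latent vector. The rescaling to [-1, 1] is a convenience chosen here; the abstract does not specify it.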


Data availability

The credit card churn prediction and auto insurance fraud detection datasets analysed during the current study cannot be shared, as the authors do not have permission to do so. However, the loan default prediction dataset, which is publicly available, can be obtained from the corresponding author on reasonable request.

References

  1. Goodfellow I, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial networks. Adv Neural Inf Process Syst. https://doi.org/10.1145/3422622


  2. Arjovsky M, Chintala S, Bottou L (2017) Wasserstein GAN. https://arxiv.org/abs/1701.07875

  3. Xu L, Skoularidou M, Cuesta-Infante A, Veeramachaneni K (2019) Modeling tabular data using conditional GAN. https://arxiv.org/pdf/1907.00503

  4. Tax DM, Duin RP (2004) Support vector data description. Mach Learn 54:45–66. https://doi.org/10.1023/B:MACH.0000008084.60811.49


  5. Kumar V, Reinartz W (2018) Customer relationship management: concept, strategy, and tools. Springer-Verlag GmbH, Germany


  6. Gangwar AK, Ravi V (2019) Generative adversarial network for oversampling data in credit card fraud detection. In: ICISS, Hyderabad, India, pp 123–134

  7. Sisodia DS, Reddy NK (2017) Performance evaluation of class balancing techniques for credit card fraud detection. In: 2017 IEEE international conference on power, control, signals and instrumentation engineering (ICPCSI), pp 2747–2752

  8. Randhawa K, Chu Kiong L, Seera M, Lim C, Nandi A (2018) Credit card fraud detection using AdaBoost and majority voting. IEEE Access. https://doi.org/10.1109/ACCESS.2018.2806420


  9. Dos Santos Tanaka FHK, Aranha C (2019) Data augmentation using GAN. https://arxiv.org/abs/1904.09135

  10. Mottini A, Lheritier A, Acuna-Agost R (2018) Airline passenger name record generation using generative adversarial networks. https://arxiv.org/abs/1807.06657

  11. Fiore U, Santis AD, Perla F, Zanetti P, Palmieri F (2019) Using generative adversarial networks for improving classification effectiveness in credit card fraud detection. Inf Sci 479:448–455


  12. Vega-Marquez B, Rubio-Escudero C, Riquelme J, Nepomuceno-Chamorro C (2020) Creation of synthetic data with conditional generative adversarial networks. In: SOCO 2019. AISC. Springer, Cham pp 231–240

  13. Che T, Li Y, Zhang R, Hjelm RD, Li W, Song Y, Bengio Y (2017) Maximum-likelihood augmented discrete generative adversarial networks. https://arxiv.org/abs/1702.07983

  14. Kusner MJ, Hernández-Lobato JM (2016) GANs for sequences of discrete elements with the Gumbel-softmax distribution. https://arxiv.org/abs/1611.04051

  15. Ping H, Stoyanovich J, Howe B (2017) Data synthesizer: privacy-preserving synthetic datasets. In: Proceedings of the 29th international conference on scientific and statistical database management. ACM, p 42

  16. Esteban C, Hyland SL, Rätsch G (2017) Real-valued (medical) time series generation with recurrent conditional GANs. https://arxiv.org/abs/1706.02633

  17. Camino R, Hammerschmidt C, State R (2018) Generating multi-categorical samples with generative adversarial networks. https://arxiv.org/abs/1807.01202

  18. Choi E, Biswal S, Malin B, Duke J, Stewart WF, Sun J (2017) Generating multi-label discrete patient records using generative adversarial networks. https://arxiv.org/abs/1703.06490

  19. Patel S, Kakadiya A, Mehta M, Derasari R, Patel R, Gandhi R (2018) Correlated discrete data generation using adversarial training. https://arxiv.org/abs/1804.00925

  20. Park N, Mohammadi M, Gorde K, Jajodia S, Park H, Kim Y (2018) Data synthesis based on generative adversarial networks. Proc VLDB Endow 11(10):1071–1083


  21. Xu L, Veeramachaneni K (2018) Synthesizing tabular data using generative adversarial networks. https://arxiv.org/pdf/1811.11264

  22. Smith KA, Gupta JN (2000) Neural networks in business: techniques and applications for the operations researcher. Comput Oper Res 27(11–12):1023–1044


  23. Ferreira JB, Vellasco M, Pacheco MA, Barbosa CH (2004) Data mining techniques on the evaluation of wireless churn. In: Proceedings of the European symposium on artificial neural networks (ESANN'2004), Bruges, Belgium. d-side publications, ISBN 2-930307-04-8, pp 483–488

  24. Kumar DA, Ravi V (2008) Predicting credit card customer churn in banks using data mining. Int J Data Anal Tech Strat 1(1):4–28


  25. Chawla NV, Bowyer KW, Hall LO, Kegelmeyer WP (2002) SMOTE: synthetic minority over-sampling technique. J Artif Intell Res 16:321–357

  26. Larivière B, Van den Poel D (2004) Investigating the role of product features in preventing customer churn, by using survival analysis and choice modelling: the case of financial services. Expert Syst Appl 27(2):277–285

  27. Ali ÖG, Arıtürk U (2014) Dynamic churn prediction framework with more effective use of rare event data: the case of private banking. Expert Syst Appl 41(17):7880–7903

  28. Verbeke W, Martens D, Mues C, Baesens B (2011) Building comprehensible customer churn prediction models with advanced rule induction techniques. Expert Syst Appl 38(3):2354–2364


  29. Tsai CF, Lu YH (2009) Customer churn prediction by hybrid neural networks. Expert Syst Appl 36(10):12547–12553


  30. Sundarkumar GG, Ravi V (2015) A novel hybrid under-sampling method for mining unbalanced datasets in banking and insurance. Eng Appl Artif Intell 37:368–377


  31. Sundarkumar GG, Ravi V, Siddeshwar V (2015) One-class support vector machine based under-sampling: application to churn prediction and insurance fraud detection. In: 2015 IEEE international conference on computational intelligence and computing research

  32. Farquad MAH, Ravi V, Bapi Raju S (2011) Analytical CRM in banking and finance using SVM: a modified active learning-based rule extraction approach. Int J Electron Cust Relatsh Manag 6(1):48–73


  33. Phua C, Damminda A, Lee V (2004) Minority report in fraud detection: classification of skewed data. Issue on imbalanced datasets. SIGKDD Explor 6(1):50–59

  34. Šubelj L, Furlan Š, Bajec M (2011) An expert system for detecting automobile insurance fraud using network analysis. Expert Syst Appl 38(1):1039–1042

  35. Lorenz EN (1963) Deterministic nonperiodic flow. J Atmos Sci 20(2):130–141


  36. Dhanya CT, Nagesh Kumar D (2010) Nonlinear ensemble prediction of chaotic daily rainfall. Adv Water Resour 33(3):327–347


  37. Packard NH, Crutchfield JP, Farmer JD, Shaw RS (1980) Geometry from a time series. Phys Rev Lett 45:712


  38. Qasim OS, Thanoon A, Algamal ZY (2020) Feature selection based on chaotic binary black hole algorithm for data classification. Chemom Intell Lab Syst 204:104104

  39. Ahmed AE, Mohamed AA, Aboul EH (2019) Chaotic multi-verse optimizer-based feature selection. Neural Comput Appl 31(4):991–1006


  40. Hu J, Heidari AA, Zhang L, Xue X, Gui W, Chen H, Pan Z (2021) Chaotic diffusion‐limited aggregation enhanced grey wolf optimizer: Insights, analysis, binarization, and feature selection. Int J Intell Syst 1–64

  41. Schölkopf B, Williamson RC, Smola A, Shawe-Taylor J, Platt J (1999) Support vector method for novelty detection. Adv Neural Inf Process Syst 12

  42. Jais I, Ismail A, Nisa SQ (2019) Adam optimization algorithm for wide and deep neural network. Knowl Eng Data Sci 2:41. https://doi.org/10.17977/um018v2i12019p41-46


  43. Abadi M, Barham P, Chen J, Chen Z, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, Kudlur M, Levenberg J, Monga R, Moore S, Murray D, Steiner B, Tucker P, Vasudevan V, Warden P, Zhang X (2016) TensorFlow: a system for large-scale machine learning

  44. Pedregosa F, Varoquaux G, Gramfort A, Thirion MB, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V, Vanderplas J, Passos A, Cournapeau D, Brucher M, Perrot M, Duchesnay E (2011) Scikit-learn: machine learning in Python. J Mach Learn Res 12(85):2825–2830

  45. Vasu M, Ravi V (2011) A hybrid under-sampling approach for mining unbalanced datasets: application to banking and insurance. Int J Data Min Model Manag 3(1):75–105

  46. Mudholkar GS, Hutson AD (1996) The exponentiated Weibull family: some properties and a flood data application. Commun Stat Theory Methods 25:3059–3083


  47. https://erdogant.github.io/distfit/pages/html/index.html

  48. KS test. https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.kstest.html

  49. Arik SO, Pfister T (2020) TabNet: attentive interpretable tabular learning. https://arxiv.org/abs/1908.07442


Author information

Corresponding author

Correspondence to Vadlamani Ravi.

Ethics declarations

Conflict of interest

The authors have no competing interests to declare that are relevant to the content of this article. Further, the authors comply with the ethical standards of the journal.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Kate, P., Ravi, V. & Gangwar, A. FinGAN: Chaotic generative adversarial network for analytical customer relationship management in banking and insurance. Neural Comput & Applic 35, 6015–6028 (2023). https://doi.org/10.1007/s00521-022-07968-x


  • DOI: https://doi.org/10.1007/s00521-022-07968-x
