Constructing and Understanding Customer Spending Prediction Models

  • Original Research
  • Published in SN Computer Science

Abstract

Prediction models are increasingly used across many sectors, and FinTech (Financial Technology) is no exception. Many problems in FinTech can be framed as prediction problems; notable examples include predicting the probability that a transaction is fraudulent or predicting the most suitable company to invest in under given constraints. In this research, the focus is on customer spending prediction: more specifically, we are interested in how much a customer may spend in a given period based on her past purchases. Such information is crucial for optimal business planning and budgeting. As a first step in tackling this prediction problem, we explore how accurately different statistical methods and machine learning algorithms can predict customer spending. The methods we investigate include the Beta Geometric/Negative Binomial Distribution (BG/NBD), Gamma–Gamma, Linear Regression, Random Forest, and the Light Gradient Boosting Machine (LightGBM). To make the prediction models and their results more accessible to average users, we use information visualization as the primary means of communicating with human users. We hope this bridges the gap between prediction performance and users’ insight into the reasons behind that performance. With better insight, users can make more appropriate decisions when selecting a method or algorithm to build a prediction model under specific circumstances. The results of this research can also serve as a foundation for more in-depth future work on the same problem.
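
To make the comparison concrete, the sketch below shows how two of the model families named above could be fitted to a raw transaction log: BG/NBD combined with Gamma–Gamma (here via the lifetimes Python package) to predict spending probabilistically, and a LightGBM regressor trained on the same RFM-style features. This is a minimal illustration only; the file name transactions.csv, the column names, the 90-day horizon, and all hyperparameters are assumptions made for the example, not the authors' exact experimental setup.

```python
# Minimal sketch (not the authors' code): compare a BG/NBD + Gamma-Gamma model
# with a LightGBM regressor for predicting per-customer spending over the next
# 90 days. Assumes a transaction log with columns customer_id, date, amount.
import pandas as pd
import lightgbm as lgb
from lifetimes import BetaGeoFitter, GammaGammaFitter
from lifetimes.utils import summary_data_from_transaction_data
from sklearn.metrics import mean_absolute_error

transactions = pd.read_csv("transactions.csv", parse_dates=["date"])

# Split the log: purchases before the cutoff train the models, the final
# 90 days provide the actual spending the models try to predict.
horizon_days = 90
cutoff = transactions["date"].max() - pd.Timedelta(days=horizon_days)
calibration = transactions[transactions["date"] <= cutoff]
holdout = transactions[transactions["date"] > cutoff]

# RFM-style summary per customer: frequency, recency, T, monetary_value.
rfm = summary_data_from_transaction_data(
    calibration, "customer_id", "date",
    monetary_value_col="amount", observation_period_end=cutoff,
)

# Actual spend per customer in the holdout window (0 for customers who lapsed).
y_true = (holdout.groupby("customer_id")["amount"].sum()
          .reindex(rfm.index, fill_value=0.0))

# BG/NBD predicts how many purchases each customer will make in the window...
bgf = BetaGeoFitter(penalizer_coef=0.01)
bgf.fit(rfm["frequency"], rfm["recency"], rfm["T"])
expected_purchases = bgf.conditional_expected_number_of_purchases_up_to_time(
    horizon_days, rfm["frequency"], rfm["recency"], rfm["T"])

# ...and Gamma-Gamma predicts the average value of those purchases
# (fitted on repeat customers only, i.e. frequency > 0).
repeat = rfm[rfm["frequency"] > 0]
ggf = GammaGammaFitter(penalizer_coef=0.01)
ggf.fit(repeat["frequency"], repeat["monetary_value"])
expected_avg_value = ggf.conditional_expected_average_profit(
    rfm["frequency"], rfm["monetary_value"])
pred_probabilistic = expected_purchases * expected_avg_value

# LightGBM regresses holdout spend directly on the same RFM features
# (trained and scored in-sample here purely for illustration; a real
# comparison would hold out a separate set of customers).
features = rfm[["frequency", "recency", "T", "monetary_value"]]
model = lgb.LGBMRegressor(n_estimators=200, learning_rate=0.05)
model.fit(features, y_true)
pred_lgbm = model.predict(features)

print("MAE, BG/NBD + Gamma-Gamma:", mean_absolute_error(y_true, pred_probabilistic))
print("MAE, LightGBM            :", mean_absolute_error(y_true, pred_lgbm))
```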



Funding

The authors did not receive support from any organization for the submitted work.

Author information

Corresponding author

Correspondence to Tran Tri Dang.

Ethics declarations

Conflict of Interest

The authors declare that they have no conflict of interest.

Ethical Approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Future Data and Security Engineering 2022” guest edited by Tran Khanh Dang.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Dang, T.T., Hoang, K.N., Thanh, L.B. et al. Constructing and Understanding Customer Spending Prediction Models. SN COMPUT. SCI. 4, 852 (2023). https://doi.org/10.1007/s42979-023-02284-0
