Automatic customer targeting: a data mining solution to the problem of asymmetric profitability distribution

Information Technology and Management

Abstract

This paper proposes a data mining approach for automatically targeting customers based on their expected profitability. The main challenge in predicting customer profitability is asymmetry, i.e., skewness of the distribution: highly profitable customers are very few compared with the rest. Although data mining methods are more resistant to sample heterogeneity than statistical ones, strong skewness often causes prediction accuracy to decrease as profit increases. These few customers are effectively outliers that can cause data-driven methods to overestimate predicted amounts; on the other hand, they carry very important information about the most valuable customers, so removing them is not advisable. The proposed approach is designed to overcome these problems. The results show that the relative error in predicting the absolute profitability of the most valuable customers is very small and differs little from the error for other customers, unlike previously applied methods, in which predictions of high profitability were less accurate. The practical implication of this accuracy is more efficient identification of the most profitable customers, who ultimately make a greater contribution to the company's revenue. Also, because of the model's good precision, errors in assessing highly profitable and risky customers are reduced, which saves marketers unnecessary costs.

Data and material availability

The authors are not permitted to share the company's data.

Funding

No funding was obtained for this research.

Author information

Corresponding author

Correspondence to Sunčica Rogić.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

About this article

Cite this article

Rogić, S., Kašćelan, L., Kašćelan, V. et al. Automatic customer targeting: a data mining solution to the problem of asymmetric profitability distribution. Inf Technol Manag 23, 315–333 (2022). https://doi.org/10.1007/s10799-021-00353-5

  • DOI: https://doi.org/10.1007/s10799-021-00353-5
