Automatic customer targeting: a data mining solution to the problem of asymmetric profitability distribution

Rogić, Sunčica; Kašćelan, Ljiljana; Kašćelan, Vladimir; Đurišić, Vladimir

doi:10.1007/s10799-021-00353-5

Published: 18 January 2022

Volume 23, pages 315–333 (2022)

Information Technology and Management

Abstract

This paper proposes a data mining approach for automatic customer targeting based on expected profitability. The main challenge in predicting customer profitability is asymmetry, i.e., skewness of the distribution: highly profitable customers are very few compared to the rest. Although data mining methods are more robust to sample heterogeneity than statistical ones, strong skewness often causes prediction accuracy to decrease as profit increases. These few customers are effectively outliers that can cause data-driven methods to overestimate predicted amounts; on the other hand, they carry very important information about the most valuable customers, so removing them is not advisable. The approach proposed here is designed to overcome these problems. The results show that the relative error in predicting the absolute profitability of the most valuable customers is very small and does not differ much from the error for other customers, unlike previously applied methods, which predicted high profitability less accurately. The practical implication of this accuracy is more efficient identification of the most profitable customers, who ultimately contribute more to the company's revenue. Also, owing to the good precision of the model, errors in assessing highly profitable and risky customers are reduced, which saves marketers unnecessary costs.

Data and material availability

The authors are not allowed to share the company's data.

Funding

No funding was obtained for this research.

Author information

Authors and Affiliations

Faculty of Economics, University of Montenegro, Jovana Tomaševića 37, 81000, Podgorica, Montenegro
Sunčica Rogić, Ljiljana Kašćelan, Vladimir Kašćelan & Vladimir Đurišić

Corresponding author

Correspondence to Sunčica Rogić.

Ethics declarations

Conflict of interest

The authors declare that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Rogić, S., Kašćelan, L., Kašćelan, V. et al. Automatic customer targeting: a data mining solution to the problem of asymmetric profitability distribution. Inf Technol Manag 23, 315–333 (2022). https://doi.org/10.1007/s10799-021-00353-5

Accepted: 07 December 2021
Published: 18 January 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10799-021-00353-5
