Understanding Telecom Customer Churn with Machine Learning: From Prediction to Causal Inference

Verhelst, Théo; Caelen, Olivier; Dewitte, Jean-Christophe; Lebichot, Bertrand; Bontempi, Gianluca

doi:10.1007/978-3-030-65154-1_11

Conference paper
First Online: 05 January 2021

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1196)

Abstract

Telecommunication companies operate in a highly competitive market where attracting new customers is much more expensive than retaining existing ones. Although retention campaigns can be used to prevent customer churn, their success depends on the availability of accurate prediction models. Churn prediction is a notoriously difficult problem because of the large amount of data, non-linearity, class imbalance, and low separability between churners and non-churners. In this paper, we discuss a real case of churn prediction based on Orange Belgium customer data. In the first part of the paper, we focus on the design of an accurate prediction model. The large imbalance between the two classes is handled with the EasyEnsemble algorithm using a random forest classifier. We also assess the impact of different data preprocessing techniques, including feature selection and feature engineering. Results show that feature selection can reduce computation time and memory requirements, whereas engineered variables do not necessarily improve performance. In the second part of the paper, we explore the application of data-driven causal inference, which aims to infer causal relationships between variables from observational data. We conclude that bill shock and wrong tariff plan positioning are putative causes of churn, a conclusion supported by the prior knowledge of experts at Orange Belgium. Finally, we present a novel method to evaluate the direction and magnitude of the impact of causally relevant variables on churn, under the assumption of no confounding factors.
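The EasyEnsemble idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. EasyEnsemble draws several random subsets of the majority class (non-churners), each the size of the minority class (churners), trains one learner per balanced subset, and averages the learners' scores. The abstract states that a random forest is used as the classifier. To keep the sketch dependency-free, a toy centroid-based scorer stands in for it here. The names `CentroidScorer`, `easy_ensemble_fit`, and `easy_ensemble_score` are hypothetical, and the data in the demo is synthetic.

```python
import random
from statistics import fmean


class CentroidScorer:
    """Toy base learner, a stand-in for the random forest named in the abstract."""

    def fit(self, X, y):
        pos = [x for x, label in zip(X, y) if label == 1]
        neg = [x for x, label in zip(X, y) if label == 0]
        # One centroid per class, computed feature by feature.
        self.pos_centroid = [fmean(col) for col in zip(*pos)]
        self.neg_centroid = [fmean(col) for col in zip(*neg)]
        return self

    def predict_score(self, x):
        d_pos = sum((a - b) ** 2 for a, b in zip(x, self.pos_centroid))
        d_neg = sum((a - b) ** 2 for a, b in zip(x, self.neg_centroid))
        total = d_pos + d_neg
        # Score in [0, 1]; higher means closer to the churner centroid.
        return 0.5 if total == 0 else d_neg / total


def easy_ensemble_fit(X, y, n_subsets=10, make_learner=CentroidScorer, seed=0):
    """Train one learner per balanced subset (assumes label 1 is the minority class)."""
    rng = random.Random(seed)
    minority = [x for x, label in zip(X, y) if label == 1]
    majority = [x for x, label in zip(X, y) if label == 0]
    learners = []
    for _ in range(n_subsets):
        # Undersample the majority class down to the minority class size.
        sampled = rng.sample(majority, len(minority))
        X_bal = minority + sampled
        y_bal = [1] * len(minority) + [0] * len(sampled)
        learners.append(make_learner().fit(X_bal, y_bal))
    return learners


def easy_ensemble_score(learners, x):
    """Average the churn scores of all learners in the ensemble."""
    return fmean(m.predict_score(x) for m in learners)


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic, purely illustrative data: 20 churners and 480 non-churners.
    churners = [[rng.gauss(2.0, 1.0), rng.gauss(2.0, 1.0)] for _ in range(20)]
    others = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(480)]
    X = churners + others
    y = [1] * len(churners) + [0] * len(others)
    models = easy_ensemble_fit(X, y, n_subsets=10)
    print(easy_ensemble_score(models, [2.0, 2.0]))
```

Every balanced subset reuses all minority examples, so no churner is discarded, while the repeated draws let the ensemble see much of the majority class overall. Any learner exposing `fit` and `predict_score` can be passed as `make_learner`, which is where a real random forest would be substituted.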

Notes

1. SIM-only indicates that the customer bought no other product than the SIM card.
2. For confidentiality reasons, the precise value of the churn rate cannot be disclosed.
3. For confidentiality reasons, the axes scales are concealed.

Author information

Authors and Affiliations

Computer Science Department, Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium
Théo Verhelst, Bertrand Lebichot & Gianluca Bontempi
Data Science Team, Orange Belgium, Brussels, Belgium
Olivier Caelen & Jean-Christophe Dewitte

Corresponding author

Correspondence to Théo Verhelst.

Editor information

Editors and Affiliations

Vrije Universiteit Brussel, Brussels, Belgium
Bart Bogaerts
Université Libre de Bruxelles, Brussels, Belgium
Gianluca Bontempi
Université de Liège, Liège, Belgium
Pierre Geurts
Vrije Universiteit Brussel, Brussels, Belgium
Nick Harley
Université Libre de Bruxelles, Brussels, Belgium
Bertrand Lebichot
Université Libre de Bruxelles, Brussels, Belgium
Tom Lenaerts
Université de Liège, Liège, Belgium
Gilles Louppe

About this paper

Cite this paper

Verhelst, T., Caelen, O., Dewitte, J.-C., Lebichot, B., Bontempi, G. (2020). Understanding Telecom Customer Churn with Machine Learning: From Prediction to Causal Inference. In: Bogaerts, B., et al. Artificial Intelligence and Machine Learning. BNAIC BENELEARN 2019. Communications in Computer and Information Science, vol 1196. Springer, Cham. https://doi.org/10.1007/978-3-030-65154-1_11

DOI: https://doi.org/10.1007/978-3-030-65154-1_11
Published: 05 January 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-65153-4
Online ISBN: 978-3-030-65154-1
eBook Packages: Computer Science, Computer Science (R0)
