A Churn Prediction Dataset from the Telecom Sector: A New Benchmark for Uplift Modeling

Verhelst, Théo; Mercier, Denis; Shrestha, Jeevan; Bontempi, Gianluca

doi:10.1007/978-3-031-74640-6_21

Part of the book series: Communications in Computer and Information Science (CCIS, volume 2136)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

Uplift modeling, also known as individual treatment effect (ITE) estimation, is an important approach for data-driven decision making that aims to identify the causal impact of an intervention on individuals. This paper introduces a new benchmark dataset for uplift modeling focused on churn prediction, coming from a telecom company in Belgium. Churn, in this context, refers to customers terminating their subscription to the telecom service. This is the first publicly available dataset offering the possibility to evaluate the efficiency of uplift modeling on the churn prediction problem. Moreover, its unique characteristics make it more challenging than the few other public uplift datasets.

Funded by the Brussels-Capital Region - Innoviris (Brussels Public Organisation for Research and Innovation) under grant number 2019-PHD-16.

Author information

Authors and Affiliations

Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium
Théo Verhelst & Gianluca Bontempi
Data Science Team, Orange Belgium, Brussels, Belgium
Denis Mercier & Jeevan Shrestha

Corresponding author

Correspondence to Théo Verhelst.

Editor information

Editors and Affiliations

University of Turin, Turin, Italy
Rosa Meo
Sapienza University of Rome, Rome, Italy
Fabrizio Silvestri

About this paper

Cite this paper

Verhelst, T., Mercier, D., Shrestha, J., Bontempi, G. (2025). A Churn Prediction Dataset from the Telecom Sector: A New Benchmark for Uplift Modeling. In: Meo, R., Silvestri, F. (eds) Machine Learning and Principles and Practice of Knowledge Discovery in Databases. ECML PKDD 2023. Communications in Computer and Information Science, vol 2136. Springer, Cham. https://doi.org/10.1007/978-3-031-74640-6_21

DOI: https://doi.org/10.1007/978-3-031-74640-6_21
Published: 01 January 2025
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-74639-0
Online ISBN: 978-3-031-74640-6
eBook Packages: Artificial Intelligence (R0)
