DOI: 10.1145/2723372.2742794
research-article

Telco Churn Prediction with Big Data

Published: 27 May 2015

Abstract

We show that telco big data can make churn prediction much easier from the perspective of the $3$ V's: Volume, Variety, and Velocity. Experimental results confirm that prediction performance improves significantly with a large volume of training data, a large variety of features from both business support systems (BSS) and operations support systems (OSS), and high-velocity processing of newly arriving data. We have deployed this churn prediction system at one of the biggest mobile operators in China. From millions of active customers, the system produces a list of the prepaid customers most likely to churn in the next month, with $0.96$ precision for the top $50000$ predicted churners in the list. Automatically matching retention campaigns with the targeted potential churners significantly boosts their recharge rates, yielding substantial business value.
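
The headline metric above, precision for the top-ranked predicted churners, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `precision_at_k`, the toy scores, and the toy labels are all assumptions made up for illustration. It only shows how such a figure is typically computed once each customer has a predicted churn score and a known outcome.

```python
from typing import Sequence


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of true churners among the k customers with the highest scores.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("no customers to rank")
    # Rank customer indices by predicted churn score, highest first.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    # If k exceeds the number of customers, the whole list is used.
    top = ranked[:k]
    # labels[i] is 1 if customer i actually churned, else 0.
    return sum(labels[i] for i in top) / len(top)


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    scores = [0.91, 0.15, 0.78, 0.40, 0.66]
    labels = [1, 0, 1, 0, 0]
    print(precision_at_k(scores, labels, k=3))
```

Under this definition, a precision of $0.96$ for the top $50000$ would mean that roughly 48000 of the 50000 highest-scored customers actually churned.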

    Published In

    SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    May 2015
    2110 pages
    ISBN:9781450327589
    DOI:10.1145/2723372

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. big data
    2. customer retention
    3. telco churn prediction

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'15: International Conference on Management of Data
    May 31 - June 4, 2015
    Melbourne, Victoria, Australia

    Acceptance Rates

    SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Cited By

    • (2024) PPHOPCM Privacy-Preserving High-order Possibilistic C-Means Algorithm for Big Data Clustering with Cloud Computing. International Journal of Advanced Research in Science, Communication and Technology, pages 39-46. DOI: 10.48175/IJARSCT-18608. Online publication date: 30-May-2024
    • (2024) Bridging Healthcare and Telecommunications A Unified Model for Multi-Task Image Classification. 2024 11th International Conference on Wireless Networks and Mobile Communications (WINCOM), pages 1-8. DOI: 10.1109/WINCOM62286.2024.10658431. Online publication date: 23-Jul-2024
    • (2024) Characterizing Internet Card User Portraits for Efficient Churn Prediction Model Design. IEEE Transactions on Mobile Computing, 23:2, pages 1735-1752. DOI: 10.1109/TMC.2023.3241206. Online publication date: Feb-2024
    • (2024) Customer Departure Prognostication: Modern Data Science Methods. 2024 International Conference on Signal Processing, Computation, Electronics, Power and Telecommunication (IConSCEPT), pages 1-5. DOI: 10.1109/IConSCEPT61884.2024.10627867. Online publication date: 4-Jul-2024
    • (2024) Advancing Telecom Customer Churn using Deep Learning. 2024 8th International Conference on Electronics, Communication and Aerospace Technology (ICECA), pages 1420-1427. DOI: 10.1109/ICECA63461.2024.10801009. Online publication date: 6-Nov-2024
    • (2024) Bank Customer Churn and Extra Benefits Prediction Using Machine Learning Model. 2024 Second International Conference on Advances in Information Technology (ICAIT), pages 1-6. DOI: 10.1109/ICAIT61638.2024.10690731. Online publication date: 24-Jul-2024
    • (2024) Churn Prediction and Customer Behavior Analysis in Telecommunications. 2024 International Conference on Advances in Computing Research on Science Engineering and Technology (ACROSET), pages 1-8. DOI: 10.1109/ACROSET62108.2024.10743509. Online publication date: 27-Sep-2024
    • (2024) Churn prediction analysis of telecom customers using svm, random forest and logistic regression models using orange data mining tools. E3S Web of Conferences, 501, article 02012. DOI: 10.1051/e3sconf/202450102012. Online publication date: 18-Mar-2024
    • (2024) Explaining customer churn prediction in telecom industry using tabular machine learning models. Machine Learning with Applications, 17, article 100567. DOI: 10.1016/j.mlwa.2024.100567. Online publication date: Sep-2024
    • (2023) Research on Application of Stacking Technique in Telecom Churn Prediction. Highlights in Science, Engineering and Technology, 31, pages 43-52. DOI: 10.54097/hset.v31i.4811. Online publication date: 10-Feb-2023
