Abstract
As the telecom market becomes saturated and competition between operators intensifies, preventing customer churn has become a top concern for telecom companies. Because churn directly affects revenue, it is crucial to identify which customers are likely to churn and why. The main contribution of this study lies in the multidimensional preprocessing, feature extraction and feature processing applied to a dataset provided by a telecom operator. The k-means algorithm is then used to cluster customers into consumer groups, so that the factors of concern to each group can be analysed and targeted suggestions made. Finally, to improve the effectiveness and robustness of the model, ensemble learning is introduced for telecom customer churn prediction. The experimental results show that the extracted features and the resulting predictions are satisfactory. Ensemble learning was also applied to the dataset provided by S. Khotijah, where it improved churn prediction accuracy whether or not the dataset was balanced, with the improvement most pronounced on the unbalanced dataset.
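The two technical steps named in the abstract, k-means clustering of customers and an ensemble classifier for churn, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy customer records, the three feature names, the choice of k = 2, and the use of hard majority voting over decision stumps as the ensemble. The paper's actual features, base learners and combination scheme are not described in this section.

```python
import random
from collections import Counter

# Hypothetical toy data: (monthly_charge, tenure_months, support_calls).
customers = [
    (80.0, 2.0, 5.0), (75.0, 3.0, 4.0), (90.0, 1.0, 6.0), (85.0, 4.0, 5.0),
    (30.0, 40.0, 0.0), (35.0, 36.0, 1.0), (25.0, 48.0, 0.0), (40.0, 30.0, 1.0),
]
churned = [1, 1, 1, 1, 0, 0, 0, 0]  # hypothetical labels: 1 = churned


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centers


def fit_stump(X, y, feature):
    """Fit a one-feature threshold classifier by maximising training accuracy."""
    best_acc, best_t, best_sign = -1.0, 0.0, 1
    for t in sorted({x[feature] for x in X}):
        for sign in (1, -1):
            preds = [1 if sign * (x[feature] - t) >= 0 else 0 for x in X]
            acc = sum(p == yi for p, yi in zip(preds, y)) / len(y)
            if acc > best_acc:
                best_acc, best_t, best_sign = acc, t, sign
    return lambda x: 1 if best_sign * (x[feature] - best_t) >= 0 else 0


def vote(models, x):
    """Hard majority vote over the base classifiers."""
    return Counter(m(x) for m in models).most_common(1)[0][0]


# Step 1: group customers so that each group can be analysed separately.
labels, centers = kmeans(customers, k=2)

# Step 2: train one stump per feature and combine them by majority vote.
models = [fit_stump(customers, churned, f) for f in range(3)]
high_risk = vote(models, (88.0, 2.0, 5.0))
low_risk = vote(models, (28.0, 45.0, 0.0))
```

In practice the features would be scaled before clustering, since k-means on raw values is dominated by whichever feature has the largest range, and the ensemble would be evaluated on held-out data rather than on the training records.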
Data availability
Our experiments mainly use two datasets. Dataset one is available at https://gitee.com/jian123654/churn_prediction_dataset; dataset two is available via the link given in the reference list (Khotijah, 2020).
Code availability
Not Applicable
References
Adhikary, D.D., & Gupta, D. (2021). Applying over 100 classifiers for churn prediction in telecom companies. Multimedia Tools and Applications, 80(28), 35123–35144. https://doi.org/10.1007/s11042-020-09658-z.
Ahmad, A.K., Jafar, A., & Aljoumaa, K. (2019). Customer churn prediction in telecom using machine learning in big data platform. Journal of Big Data, 6(1), 1–24. https://doi.org/10.1186/s40537-019-0191-6.
Ahn, J., Hwang, J., Kim, D., et al. (2020). A survey on churn analysis in various business domains. IEEE Access, 8, 220816–220839. https://doi.org/10.1109/ACCESS.2020.3042657.
Castanedo, F., Valverde, G., Zaratiegui, J., et al. (2014). Using deep learning to predict customer churn in a mobile telecommunication network. Wise Athena LLC, pp. 1–8.
Dalvi, P.K., Khandge, S.K., Deomore, A., et al. (2016). Analysis of customer churn prediction in telecom industry using decision trees and logistic regression. Figshare https://doi.org/10.1109/CDAN.2016.7570883.
Huang, B., Kechadi, M.T., & Buckley, B. (2012). Customer churn prediction in telecommunications. Expert Systems with Applications, 39(1), 1414–1425. https://doi.org/10.1016/j.eswa.2011.08.024.
Hung, S.Y., Yen, D.C., & Wang, H.Y. (2006). Applying data mining to telecom churn management. Expert Systems with Applications, 31(3), 515–524. https://doi.org/10.1016/j.eswa.2005.09.080.
Idris, A., Rizwan, M., & Khan, A. (2012). Churn prediction in telecom using random forest and pso based data balancing in combination with various feature selection strategies. Computers & Electrical Engineering, 38(6), 1808–1819. https://doi.org/10.1016/j.compeleceng.2012.09.001.
Induja, S., & Eswaramurthy, D. (2016). Customers churn prediction and attribute selection in telecom industry using kernelized extreme learning machine and bat algorithms. International Journal of Science and Research, 5, 258–265.
Khotijah, S. (2020). Churn prediction data sets. Kaggle. https://www.kaggle.com/code/khotijahs1/churn-prediction.
Lejeune, M. (2001). Measuring the impact of data mining on churn management. Internet Research, 11, 375–387. https://doi.org/10.1108/10662240110410183.
Mitrović, S., Baesens, B., Lemahieu, W., et al. (2018). On the operational efficiency of different feature types for telco churn prediction. European Journal of Operational Research, 267(3), 1141–1155. https://doi.org/10.1016/j.ejor.2017.12.015.
Praseeda, C., & Shivakumar, B. (2021). Fuzzy particle swarm optimization (fpso) based feature selection and hybrid kernel distance based possibilistic fuzzy local information c-means (hkd-pflicm) clustering for churn prediction in telecom industry. SN Applied Sciences, 3(6), 1–18. https://doi.org/10.1007/s42452-021-04576-7.
Qureshi, S., Rehman, A., Qamar, A., et al. (2013). Telecommunication subscribers’ churn prediction model using machine learning. Figshare https://doi.org/10.1109/ICDIM.2013.6693977.
Raja, B., & Jeyakumar, P. (2019). An effective classifier for predicting churn in telecommunication. Journal of Advanced Research in Dynamical and Control Systems, 11, 221–229.
Tarnowska, K., Ras, Z.W., & Daniel, L. (2020). Recommender System for Improving Customer Loyalty. Springer. https://doi.org/10.1007/978-3-030-13438-9.
Tarnowska, K.A., & Ras, Z. (2021). NLP-based customer loyalty improvement recommender system (CLIRS2). Big Data and Cognitive Computing, 5(1), 4. https://doi.org/10.3390/bdcc5010004.
Ullah, I., Raza, B., Malik, A.K., et al. (2019). A churn prediction model using random forest: analysis of machine learning techniques for churn prediction and factor identification in telecom sector. IEEE Access, 7, 60134–60149. https://doi.org/10.1109/ACCESS.2019.2914999.
Vijaya, J., & Sivasankar, E. (2019). An efficient system for customer churn prediction through particle swarm optimization based feature selection model with simulated annealing. Cluster Computing, 22(5), 10757–10768. https://doi.org/10.1007/s10586-017-1172-1.
Acknowledgements
This work is supported by three projects: Research on High-precision Positioning Technology for Snow and Ice Emergencies in 5G-based VR Scenarios (No. 20470302D), Research Project on Basic Research Funds for Higher Education Institutions in Hebei Province (No. 2021QNJS12) and Deep Learning Behavioural Recognition Fall Detection Research (No. 2022CXTD04).
Funding
Not Applicable
Author information
Contributions
Not Applicable
Ethics declarations
Ethics approval
Not Applicable
Consent for publication
Not Applicable
Consent to participate
Not Applicable
Conflict of interests
The authors have no conflicts of interest to declare that are relevant to the content of this article.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Liu, Y., Fan, J., Zhang, J. et al. Research on telecom customer churn prediction based on ensemble learning. J Intell Inf Syst 60, 759–775 (2023). https://doi.org/10.1007/s10844-022-00739-z
DOI: https://doi.org/10.1007/s10844-022-00739-z