Adaptive telecom churn prediction for concept-sensitive imbalance data streams

The Journal of Supercomputing

Abstract

A key to intelligent decision-making in industry lies in the ability to process and analyze vast quantities of business data. Concept drift and class imbalance co-exist in real-life data sets, and the telecommunication sector is no exception. It has only recently been discovered that these two problems, once thought to be entirely independent, are correlated in many ways and adversely affect each other. The Cumulative Sum Detection Method (CusumDM), a simple method based on cumulative sum charts, handles concept drift efficiently; however, its performance declines in the presence of class imbalance. To facilitate intelligent customer churn prediction, this article proposes the optimized two-sided Cusum churn detector (OTCCD), a significant improvement on CusumDM that handles class imbalance and concept drift together by also determining the error rate over sliding windows. The approach has been applied to the Call Detail Records (CDR) of a South Asian telecom company for churn prediction. Handling telecom data requires high-performance computing power and resource-aware intelligent methods because of the velocity of the data. The extremely low number of churners creates class imbalance in the CDR, and the abruptly or gradually changing data distribution makes churn data a rich testbed for these problems. Classification results were verified on prediction accuracy and mean evaluation time, showing that OTCCD outperformed its predecessors CusumDM, the Ensemble Drift Detection Method (Ensemble), and the Adaptive Windowing (ADWIN) change detector by producing better accuracy.
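
To make the idea of a two-sided cumulative sum test over a sliding-window error rate concrete, here is a minimal Python sketch. It is not the authors' OTCCD: the class name `TwoSidedCusumSketch`, the parameters `window`, `slack`, `threshold`, and `warmup`, and the synthetic error stream are all assumptions chosen for illustration, and the sketch does not address class imbalance.

```python
from collections import deque


class TwoSidedCusumSketch:
    """Illustrative two-sided CUSUM over a stream of 0/1 prediction errors.

    Hypothetical sketch only; names and defaults are assumptions, not
    values taken from the article.
    """

    def __init__(self, window=50, slack=0.05, threshold=2.0, warmup=100):
        self.window_size = window
        self.slack = slack          # tolerated deviation before accumulating
        self.threshold = threshold  # alarm level for either cumulative sum
        self.warmup = warmup        # samples used to estimate the reference rate
        self.reset()

    def reset(self):
        self.n = 0
        self.ref_rate = 0.0
        self.g_pos = 0.0  # accumulates increases in the error rate
        self.g_neg = 0.0  # accumulates decreases in the error rate
        self.recent = deque(maxlen=self.window_size)

    def update(self, error):
        """Feed one 0/1 error; return True if a change is signalled."""
        self.n += 1
        self.recent.append(error)
        if self.n <= self.warmup:
            # Incremental mean of the errors seen during warm-up.
            self.ref_rate += (error - self.ref_rate) / self.n
            return False
        # Error rate over the sliding window of recent predictions.
        window_rate = sum(self.recent) / len(self.recent)
        deviation = window_rate - self.ref_rate
        self.g_pos = max(0.0, self.g_pos + deviation - self.slack)
        self.g_neg = max(0.0, self.g_neg - deviation - self.slack)
        if self.g_pos > self.threshold or self.g_neg > self.threshold:
            self.reset()  # start learning the new concept from scratch
            return True
        return False


def synthetic_errors(n_before=300, n_after=300):
    """Deterministic toy stream: 10% errors, then 50% errors (made-up data)."""
    for i in range(n_before):
        yield 1 if i % 10 == 0 else 0
    for i in range(n_before, n_before + n_after):
        yield 1 if i % 2 == 0 else 0


detector = TwoSidedCusumSketch()
alarms = [i for i, e in enumerate(synthetic_errors()) if detector.update(e)]
print("change signalled at stream positions:", alarms)
```

The two sums let the detector react to both a rising and a falling error rate, while the slack term keeps small fluctuations from accumulating. On the toy stream, the alarm should fire shortly after position 300, where the error rate changes.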

Author information

Corresponding author

Correspondence to Muhammad Usman.

About this article

Cite this article

Toor, A.A., Usman, M. Adaptive telecom churn prediction for concept-sensitive imbalance data streams. J Supercomput 78, 3746–3774 (2022). https://doi.org/10.1007/s11227-021-04021-x

  • DOI: https://doi.org/10.1007/s11227-021-04021-x
