Feature-selection-based dynamic transfer ensemble model for customer churn prediction

Xiao, Jin; Xiao, Yi; Huang, Anqiang; Liu, Dunhu; Wang, Shouyang

doi:10.1007/s10115-013-0722-y

Regular Paper
Published: 16 January 2014

Volume 43, pages 29–51, (2015)

Knowledge and Information Systems

Abstract

Customer churn prediction is one of the key steps in maximizing the value of customers to an enterprise. Traditional models assume that the training and test data follow the same distribution, and they rarely achieve satisfactory prediction performance because, in reality, customers usually come from different districts and may follow different distributions. This study proposes a feature-selection-based dynamic transfer ensemble (FSDTE) model that introduces transfer learning theory to utilize customer data from both the target domain and related source domains. The model conducts a two-layer feature selection. In the first layer, an initial feature subset is selected by a GMDH-type neural network using only the target domain. In the second layer, several appropriate patterns are selected from the source domain and added to the target training set, and the features with higher mutual information with the class variable are combined with the initial subset to construct a new feature subset. The second-layer selection is repeated several times to generate a series of new feature subsets, and a base classifier is trained on each one. Finally, the best base classifier is selected dynamically for each test pattern. Experimental results on two customer churn prediction datasets show that FSDTE achieves better performance than traditional churn prediction strategies and three existing transfer learning strategies.
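
Two ingredients named in the abstract, ranking features by their mutual information with the class variable and choosing one base classifier per test pattern, can be illustrated with a minimal sketch. This is not the authors' implementation: the GMDH-type first layer and the selection of source-domain patterns are omitted, the function names (`mutual_information`, `rank_features`, `select_classifier`) are hypothetical, and the selection rule used here (highest accuracy on the k nearest validation patterns) is an assumption, since the abstract does not state how the best classifier is chosen.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py = Counter(xs), Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def rank_features(rows, labels):
    """Feature indices sorted by decreasing mutual information with the label."""
    n_features = len(rows[0])
    scores = [
        mutual_information([r[j] for r in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def select_classifier(classifiers, val_rows, val_labels, x, k=3):
    """Assumed rule: pick the classifier that is most accurate on the
    k validation patterns closest to x (squared Euclidean distance)."""
    def dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    nearest = sorted(range(len(val_rows)), key=lambda i: dist(val_rows[i], x))[:k]

    def local_accuracy(clf):
        hits = sum(clf(val_rows[i]) == val_labels[i] for i in nearest)
        return hits / len(nearest)

    return max(classifiers, key=local_accuracy)


# Toy data (made up for illustration): feature 0 equals the label,
# feature 1 is independent of it.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]

# Two trivial "base classifiers", each predicting from a single feature.
clf_a = lambda r: r[0]
clf_b = lambda r: r[1]

ranking = rank_features(rows, labels)
chosen = select_classifier([clf_b, clf_a], rows, labels, (1, 1))
```

On this toy data the informative feature is ranked first, and the classifier built on it is the one selected for the test pattern `(1, 1)`. In the model described above, each base classifier would instead be trained on a different feature subset produced by the second selection layer.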

Acknowledgments

Thanks to the anonymous reviewers and the editor for helpful comments on an earlier version of this paper. This research is partly supported by the Natural Science Foundation of China under Grant Nos. 71101100, 70731160635, and 71273036, New Teachers’ Fund for Doctor Stations, Ministry of Education under Grant No. 20110181120047, Excellent Youth Fund of Sichuan University under Grant No. 2013SCU04A08, China Postdoctoral Science Foundation under Grant Nos. 2011M500418, 2012T50148, and 2013M530753, Frontier and Cross-innovation Foundation of Sichuan University under Grant No. skqy201352, Soft Science Foundation of Sichuan Province under Grant No. 2013ZR0016, Humanities and Social Sciences Youth Foundation of the Ministry of Education of PR China under Grant No. 11YJC870028, and Self-determined Research Funds of CCNU from the Colleges’ Basic Research and Operation of MOE under Grant No. CCNU13F030.

Author information

Authors and Affiliations

Business School, Sichuan University, Chengdu, 610064, China
Jin Xiao
School of Information Management, Central China Normal University, Wuhan, 430079, China
Yi Xiao
School of Economics and Management, Beihang University, Beijing, 100083, China
Anqiang Huang
Management Faculty, Chengdu University of Information Technology, Chengdu, 610103, China
Dunhu Liu
Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, 100190, China
Shouyang Wang

Corresponding author

Correspondence to Shouyang Wang.

About this article

Cite this article

Xiao, J., Xiao, Y., Huang, A. et al. Feature-selection-based dynamic transfer ensemble model for customer churn prediction. Knowl Inf Syst 43, 29–51 (2015). https://doi.org/10.1007/s10115-013-0722-y

Received: 20 November 2012
Revised: 11 November 2013
Accepted: 04 December 2013
Published: 16 January 2014
Issue Date: April 2015
DOI: https://doi.org/10.1007/s10115-013-0722-y
