Investigating Decision Tree in Churn Prediction with Class Imbalance

Authors:
Bing Zhu, Business School, Sichuan University, Chengdu, China
Guicai Xie, Business School, Sichuan University, Chengdu, China
Yuan Yuan, Business School, Sichuan University, Chengdu, China
Yiqin Duan, Business School, Sichuan University, Chengdu, China

ICDPA 2018: Proceedings of the International Conference on Data Processing and Applications, May 2018, Pages 11–15. https://doi.org/10.1145/3224207.3224217

Published: 12 May 2018

ABSTRACT

Class imbalance presents significant challenges to customer churn prediction. Traditional machine learning algorithms such as decision trees tend to be biased toward the majority class. In this paper, we comprehensively study the performance of decision trees in churn prediction with class imbalance. We investigate the pruning setting and the optimal sampling strategy based on a recently developed expected maximum profit criterion. The experiments yield some conclusions that differ from previous research in which the area under the ROC curve was used, and an optimal sampling strategy is recommended. Our findings provide a useful guideline for the use of decision trees in churn prediction.
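To make the two ingredients named in the abstract concrete, the following is a minimal sketch of random undersampling of the majority class and of the area under the ROC curve computed as a pairwise ranking statistic. It is an illustration under stated assumptions, not the authors' experimental setup: the function names `undersample` and `auc`, the `minority_share` parameter, and the toy data are all hypothetical, and the expected maximum profit criterion is not implemented because the abstract does not define it.

```python
import random

def undersample(rows, labels, minority_share, seed=0):
    """Randomly drop majority-class rows (label 0) until the minority
    class (label 1, the churners) makes up about `minority_share`."""
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Number of majority rows that yields the requested class ratio.
    keep = round(len(minority) * (1 - minority_share) / minority_share)
    keep = min(keep, len(majority))
    chosen = sorted(minority + rng.sample(majority, keep))
    return [rows[i] for i in chosen], [labels[i] for i in chosen]

def auc(scores, labels):
    """Area under the ROC curve: the probability that a random churner
    is scored above a random non-churner (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: 5 churners among 100 customers (5% churn rate).
labels = [1] * 5 + [0] * 95
rows = [[i] for i in range(100)]

# Rebalance the training data to a 50/50 class distribution.
balanced_rows, balanced_labels = undersample(rows, labels, minority_share=0.5)
```

A tree trained on `balanced_rows` would then be scored on untouched test data. Varying `minority_share` and comparing the resulting scores is one simple way to explore which class distribution works best for a given evaluation measure.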


Index Terms

1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
2. Information systems
  1. Information systems applications
    1. Data mining

Published in

ICDPA 2018: Proceedings of the International Conference on Data Processing and Applications
May 2018
73 pages
ISBN: 9781450364188
DOI: 10.1145/3224207

Copyright © 2018 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Churn prediction
class imbalance
decision tree
expected maximum profit measure
Qualifiers
- research-article
- Research
- Refereed limited