
K-local maximum margin feature extraction algorithm for churn prediction in telecom

Published in: Cluster Computing

Abstract

Telecom customer churn data is rarely publicly available because it involves users' personal privacy. In 2009, the French telecommunications company Orange provided a telecom customer churn data set, KDD Cup 09, for the Knowledge Discovery and Data Mining (KDD) competition. To address the high dimensionality of KDD Cup 09, a new feature reduction method is used to explore the influence of different features on the predictions of the classification model. In this paper, a new K-local maximum margin feature extraction algorithm (KLMM) is proposed. By studying diversified subspace partition rules, the corresponding potential field structure is constructed. Based on the scalability of the data source in each dimension, the intrinsic link between data attributes and classification results is revealed. The extracted features reduce the dimensionality of the telecom churn prediction data. The KLMM method adopts an automatically selected sigma factor to reflect the anisotropy of the features. A potential function is used to assess the weights of attributes and identify the potentially important ones. Experiments and analysis show that the features extracted by KLMM are more likely to yield a classification hyperplane that separates data points of different classes.
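The abstract describes the method only at a high level. As a minimal sketch of the general idea of potential-function-based attribute weighting, the following uses a Gaussian "data field" potential per attribute with a simple fallback sigma heuristic; the function name, the sigma rule, and the scoring itself are illustrative assumptions, not the paper's actual KLMM algorithm.

```python
import numpy as np

def attribute_potentials(X, sigma=None):
    """Score each attribute by the mean Gaussian potential it induces.

    For each column, every pair of samples contributes
    exp(-((x_i - x_j) / sigma)^2); a higher mean potential means the
    attribute's values are more concentrated. When sigma is None, a
    simple per-column heuristic (std / sqrt(n)) is used as a stand-in
    for the paper's automatic sigma selection.
    """
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    scores = np.empty(d)
    for j in range(d):
        col = X[:, j]
        s = sigma if sigma is not None else col.std() / np.sqrt(n) + 1e-12
        # pairwise Gaussian potential over all sample pairs for this attribute
        diff = col[:, None] - col[None, :]
        phi = np.exp(-(diff / s) ** 2)
        scores[j] = phi.sum() / (n * n)
    return scores
```

With a fixed sigma, a near-constant attribute scores close to 1 while a widely spread attribute scores lower; comparing such scores between the churn and non-churn classes is one plausible way to flag attributes whose distributions differ.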



Acknowledgements

This work was supported by the National Natural Science Foundation of China (71271125, 61502260) and Natural Science Foundation of Shandong Province, China (ZR2011FM028).

Author information


Corresponding author

Correspondence to Long Zhao.


Cite this article

Zhao, L., Gao, Q., Dong, X. et al. K-local maximum margin feature extraction algorithm for churn prediction in telecom. Cluster Comput 20, 1401–1409 (2017). https://doi.org/10.1007/s10586-017-0843-2

