Local PCA Regression for Missing Data Estimation in Telecommunication Dataset

  • Conference paper
PRICAI 2010: Trends in Artificial Intelligence (PRICAI 2010)

Part of the book series: Lecture Notes in Computer Science ((LNAI,volume 6230))

Abstract

Customer churn seriously affects telecommunication services in particular and businesses in general. In the majority of cases, the number of potential churners is much smaller than the number of non-churners. This imbalanced distribution of samples between churners and non-churners is therefore a concern when building a churn prediction model. This paper presents a Local PCA approach that addresses the imbalanced classification problem by generating new churn samples. The experiments were carried out on a large real-world telecommunication dataset and assessed on a churn prediction task. They showed that Local PCA combined with SMOTE outperformed linear regression and standard PCA data generation techniques.
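To make the idea of generating new minority-class samples concrete, the sketch below shows SMOTE-style interpolation between neighbouring churn samples. It is not the paper's Local PCA method, whose details are not given in the abstract; the function name `smote_like_oversample`, its parameters, and the toy `churners` array are all hypothetical and chosen only for illustration.

```python
import numpy as np


def smote_like_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between each sampled
    minority row and one of its k nearest minority-class neighbours.

    Illustrative sketch only; names and defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    n = minority.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances between minority samples.
    diff = minority[:, None, :] - minority[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbour

    # Indices of the k nearest neighbours of every minority sample.
    neighbours = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbours for each synthetic row.
    base = rng.integers(0, n, size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]

    # Place the new point at a random position on the connecting segment.
    gap = rng.random((n_new, 1))
    return minority[base] + gap * (minority[pick] - minority[base])


# Toy usage: four made-up "churner" feature vectors, ten synthetic ones.
churners = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_like_oversample(churners, n_new=10, k=2)
```

Because every synthetic row lies on a segment between two existing churn samples, the generated data stays inside the region already occupied by the minority class. A PCA-based generator would presumably exploit the local covariance structure instead, but that design is the paper's and is not reproduced here.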

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Sato, T., Huang, B.Q., Huang, Y., Kechadi, M.T. (2010). Local PCA Regression for Missing Data Estimation in Telecommunication Dataset. In: Zhang, BT., Orgun, M.A. (eds) PRICAI 2010: Trends in Artificial Intelligence. PRICAI 2010. Lecture Notes in Computer Science(), vol 6230. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15246-7_67

  • DOI: https://doi.org/10.1007/978-3-642-15246-7_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-15245-0

  • Online ISBN: 978-3-642-15246-7

  • eBook Packages: Computer Science, Computer Science (R0)
