A G-Means Update Ensemble Learning Approach for the Imbalanced Data Stream with Concept Drifts

  • Conference paper
Big Data Analytics and Knowledge Discovery (DaWaK 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9829)

Abstract

Concept drift has become an important issue in data stream analysis. Data streams can also have skewed class distributions, a problem known as class imbalance. In real-world settings, a data stream is likely to exhibit both multiple concept drifts and an imbalanced class distribution at the same time. However, because most existing approaches do not address class imbalance and concept drift together, they may achieve good overall average accuracy while performing very poorly on the minority class. To address these challenges, this paper proposes a new weighting method that further improves minority-class accuracy on imbalanced data streams with concept drifts. The experimental results confirm that our method not only achieves strong average accuracy but also improves minority-class accuracy on imbalanced data streams.
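The gap between overall accuracy and minority-class accuracy described above can be illustrated with a small sketch. It is not the paper's weighting method, which the abstract does not specify; it only computes the standard G-mean (the geometric mean of per-class recalls) next to plain accuracy. The function names and the 95/5 class split are hypothetical and chosen for illustration.

```python
import math

def per_class_recall(y_true, y_pred, label):
    """Fraction of examples whose true class is `label` that were predicted as `label`."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    total = sum(1 for t in y_true if t == label)
    return hits / total if total else 0.0

def accuracy(y_true, y_pred):
    """Overall fraction of correct predictions."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)

def g_mean(y_true, y_pred):
    """Geometric mean of the recalls of all classes present in y_true."""
    labels = sorted(set(y_true))
    recalls = [per_class_recall(y_true, y_pred, c) for c in labels]
    return math.prod(recalls) ** (1.0 / len(recalls))

# Hypothetical data chunk: 95 majority-class (0) and 5 minority-class (1) examples.
y_true = [0] * 95 + [1] * 5

# Classifier A ignores the minority class entirely.
always_majority = [0] * 100

# Classifier B makes 10 errors on the majority class and 1 on the minority class.
balanced = [0] * 85 + [1] * 10 + [1] * 4 + [0] * 1

for name, y_pred in [("always_majority", always_majority), ("balanced", balanced)]:
    print(name, "accuracy =", round(accuracy(y_true, y_pred), 3),
          "g_mean =", round(g_mean(y_true, y_pred), 3))
```

In this sketch the classifier that never predicts the minority class has the higher accuracy but a G-mean of zero, whereas the second classifier gives up some accuracy and keeps a non-zero recall on both classes. This is the kind of behavior that a G-mean-based criterion rewards and that plain accuracy hides.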

Corresponding author

Correspondence to Sin-Kai Wang.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Wang, SK., Dai, BR. (2016). A G-Means Update Ensemble Learning Approach for the Imbalanced Data Stream with Concept Drifts. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_17

  • DOI: https://doi.org/10.1007/978-3-319-43946-4_17

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43945-7

  • Online ISBN: 978-3-319-43946-4

  • eBook Packages: Computer Science, Computer Science (R0)
