A G-Means Update Ensemble Learning Approach for the Imbalanced Data Stream with Concept Drifts

Wang, Sin-Kai; Dai, Bi-Ru

doi:10.1007/978-3-319-43946-4_17


Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9829)

Included in the following conference series:

International Conference on Big Data Analytics and Knowledge Discovery

Abstract

Concept drift has become an important issue in the analysis of data streams. Data streams can also have skewed class distributions, a problem known as class imbalance. In practice, a data stream is likely to exhibit multiple concept drifts and an imbalanced class distribution at the same time. However, most existing approaches do not address class imbalance and concept drift together, so they may achieve good overall average accuracy while their accuracy on the minority class remains very poor. To address these challenges, this paper proposes a new weighting method that further improves minority-class accuracy on imbalanced data streams with concept drifts. The experimental results confirm that the method not only achieves strong average accuracy but also improves minority-class accuracy on imbalanced data streams.
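The gap between overall accuracy and minority-class accuracy described in the abstract can be illustrated with a small sketch. It uses the standard definition of G-mean (the geometric mean of the per-class recalls), which the title refers to; it is not the weighting method proposed in the paper, whose details are not given on this page. The function names and the toy labels are hypothetical and chosen only for illustration.

```python
import math

def class_recalls(y_true, y_pred):
    """Return {label: recall} for every label that occurs in y_true."""
    totals, hits = {}, {}
    for t, p in zip(y_true, y_pred):
        totals[t] = totals.get(t, 0) + 1
        if t == p:
            hits[t] = hits.get(t, 0) + 1
    return {label: hits.get(label, 0) / totals[label] for label in totals}

def accuracy(y_true, y_pred):
    """Fraction of all examples that are classified correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def g_mean(y_true, y_pred):
    """Geometric mean of the per-class recalls (standard G-mean)."""
    recalls = class_recalls(y_true, y_pred)
    return math.prod(recalls.values()) ** (1 / len(recalls))

# Hypothetical data chunk: 95 majority examples (0) and 5 minority examples (1).
y_true = [0] * 95 + [1] * 5
# Assumed classifier output: every majority example is correct,
# but only 1 of the 5 minority examples is recognised.
y_pred = [0] * 95 + [1] * 1 + [0] * 4

print("accuracy:", accuracy(y_true, y_pred))
print("recalls: ", class_recalls(y_true, y_pred))
print("g-mean:  ", g_mean(y_true, y_pred))
```

On this toy chunk the overall accuracy is high even though four of the five minority examples are missed, while the G-mean is pulled down by the low minority recall. This is one reason a G-mean-based score can be a more informative signal than plain accuracy on imbalanced streams.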

Author information

Authors and Affiliations

Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology, Taipei, Taiwan, ROC
Sin-Kai Wang & Bi-Ru Dai

Corresponding author

Correspondence to Sin-Kai Wang.

Editor information

Editors and Affiliations

University of Science and Technology, Rolla, Missouri, USA
Sanjay Madria
Osaka University, Osaka, Japan
Takahiro Hara

About this paper

Cite this paper

Wang, SK., Dai, BR. (2016). A G-Means Update Ensemble Learning Approach for the Imbalanced Data Stream with Concept Drifts. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_17

DOI: https://doi.org/10.1007/978-3-319-43946-4_17
Published: 06 August 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-43945-7
Online ISBN: 978-3-319-43946-4
eBook Packages: Computer Science, Computer Science (R0)
