A two ensemble system to handle concept drifting data streams: recurring dynamic weighted majority

Sidhu, Parneeta; Bhatia, M. P. S.

doi:10.1007/s13042-017-0738-9

Original Article
Published: 02 November 2017

Volume 10, pages 563–578, (2019)

International Journal of Machine Learning and Cybernetics

Parneeta Sidhu¹ &
M. P. S. Bhatia¹

Abstract

We present recurring dynamic weighted majority (RDWM), an ensemble system that maintains two ensembles of experts in order to handle drifting concepts accurately, in particular recurrent drifts. The primary online ensemble represents the current concepts, and the secondary ensemble represents the old concepts seen since the beginning of learning. An effective pruning methodology removes redundant and old classifiers, which might otherwise interfere with learning new concepts. Experimental evaluation on datasets shows that RDWM achieves very high generalization accuracy, irrespective of the speed or severity of drift or the presence of noise in the dataset.
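The two-ensemble idea can be illustrated with a minimal sketch. It is not the RDWM algorithm itself, whose details the abstract does not give. It assumes a weighted-majority update in the style of dynamic weighted majority (Kolter and Maloof), a toy count-based base learner, and a hypothetical recall rule in which an archived expert is brought back when it classifies the current example correctly. The names `CountingExpert`, `TwoEnsembleSketch`, and `recurring_stream`, and the parameters `beta`, `theta`, `period`, and `archive_size`, are illustrative assumptions.

```python
from collections import Counter


class CountingExpert:
    """Toy incremental learner (assumption): naive-Bayes-style counts."""

    def __init__(self):
        self.labels = Counter()    # label -> examples seen
        self.features = Counter()  # (index, value, label) -> examples seen

    def predict(self, x):
        if not self.labels:
            return None  # nothing learned yet

        def score(label):
            s = float(self.labels[label])
            for i, v in enumerate(x):
                s *= (self.features[(i, v, label)] + 1) / (self.labels[label] + 2)
            return s

        return max(self.labels, key=score)

    def learn(self, x, y):
        self.labels[y] += 1
        for i, v in enumerate(x):
            self.features[(i, v, y)] += 1


class TwoEnsembleSketch:
    """Primary ensemble for current concepts, secondary archive for old ones."""

    def __init__(self, beta=0.5, theta=0.01, period=5, archive_size=10):
        self.beta, self.theta, self.period = beta, theta, period
        self.archive_size = archive_size
        self.primary = [[CountingExpert(), 1.0]]  # [expert, weight] pairs
        self.secondary = []                       # archived experts
        self.t = 0

    def update(self, x, y):
        """Predict x, then learn from (x, y); returns the prediction."""
        self.t += 1
        check = self.t % self.period == 0
        votes = Counter()
        for pair in self.primary:
            local = pair[0].predict(x)
            if check and local != y:
                pair[1] *= self.beta  # penalise an expert that erred
            if local is not None:
                votes[local] += pair[1]
        guess = votes.most_common(1)[0][0] if votes else None
        if check:
            top = max(w for _, w in self.primary)
            for pair in self.primary:
                pair[1] /= top  # rescale so the best weight is 1
            # Pruning: low-weight experts move to the archive (assumed rule).
            weak = [e for e, w in self.primary if w < self.theta]
            self.primary = [p for p in self.primary if p[1] >= self.theta]
            self.secondary = (self.secondary + weak)[-self.archive_size:]
            if guess != y:
                # Recall an archived expert that fits, else start a new one.
                old = next((e for e in self.secondary if e.predict(x) == y), None)
                if old is not None:
                    self.secondary.remove(old)
                self.primary.append([old if old is not None else CountingExpert(), 1.0])
        for expert, _ in self.primary:
            expert.learn(x, y)
        return guess


def recurring_stream(block=100):
    """Synthetic stream: concept A (y = x), concept B (y = 1 - x), then A again."""
    for i in range(3 * block):
        x = (i % 2,)
        concept_b = (i // block) == 1
        yield x, ((1 - x[0]) if concept_b else x[0])
```

Feeding `recurring_stream()` through `TwoEnsembleSketch.update` one example at a time gives a prequential (test-then-train) run. In this toy setting the expert trained on the first concept is archived during the second concept and recalled when the first concept returns, instead of being relearned from scratch.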

Author information

Authors and Affiliations

Division of CoE, Netaji Subhas Institute of Technology, Sec-3, Dwarka, New Delhi, 110078, India
Parneeta Sidhu & M. P. S. Bhatia

Corresponding author

Correspondence to Parneeta Sidhu.

About this article

Cite this article

Sidhu, P., Bhatia, M.P.S. A two ensemble system to handle concept drifting data streams: recurring dynamic weighted majority. Int. J. Mach. Learn. & Cyber. 10, 563–578 (2019). https://doi.org/10.1007/s13042-017-0738-9

Received: 04 August 2016
Accepted: 26 October 2017
Published: 02 November 2017
Issue Date: 01 March 2019
DOI: https://doi.org/10.1007/s13042-017-0738-9
