An ensemble method for data stream classification in the presence of concept drift

Abbaszadeh, Omid; Amiri, Ali; Khanteymoori, Ali Reza

doi:10.1631/FITEE.1400398

Published: 10 December 2015

Volume 16, pages 1059–1068, (2015)

Frontiers of Information Technology & Electronic Engineering

Omid Abbaszadeh¹,
Ali Amiri¹ &
Ali Reza Khanteymoori¹

Abstract

Data stream management and processing is a recent area of interest in computer science. A data stream is a continuous, rapidly generated sequence of data packages. Data streams differ from standard types of data in their immense volume, high production rate, limited processing time, and concept drift. One open problem for data streams is the classification of incoming data. This paper proposes a novel ensemble classifier. The classifier applies two weighting functions to its base classifiers under different input data conditions. A new method is used to detect drift, with an emphasis on the precision of the algorithm. The method also removes a varying number of base classifiers according to their quality, and applies a weighting mechanism to the base classifiers at the decision-making stage. This makes the ensemble more adaptable when drift occurs and leads to more efficient classifiers. The proposed method is tested on a set of standard data, and the results confirm higher accuracy than available ensemble classifiers and single classifiers. In some cases the proposed classifier is also faster and needs less storage space.
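
The abstract does not specify the two weighting functions or the drift detection rule, so the following is only a generic sketch of the ideas it names: base classifiers that vote with weights, and removal of a varying number of low-quality members after each labeled chunk. It is not the authors' algorithm. The class `WeightedEnsemble`, the helper `accuracy`, the accuracy-based weight, and the parameters `max_members` and `min_weight` are all assumptions introduced for illustration, and no drift detector is included.

```python
from collections import defaultdict


def accuracy(clf, chunk):
    """Fraction of (x, y) pairs in the chunk that clf labels correctly."""
    return sum(clf(x) == y for x, y in chunk) / len(chunk)


class WeightedEnsemble:
    """Generic chunk-based weighted-vote ensemble (illustrative only)."""

    def __init__(self, max_members=5, min_weight=0.5):
        self.max_members = max_members  # assumed cap on ensemble size
        self.min_weight = min_weight    # assumed pruning threshold
        self.members = []               # list of [classifier, weight]

    def predict(self, x):
        # Weighted vote: each member adds its weight to the label it predicts.
        votes = defaultdict(float)
        for clf, weight in self.members:
            votes[clf(x)] += weight
        return max(votes, key=votes.get) if votes else None

    def update(self, chunk, new_clf):
        # Re-weight every member (and the newcomer) on the newest labeled chunk.
        candidates = [clf for clf, _ in self.members] + [new_clf]
        scored = [[clf, accuracy(clf, chunk)] for clf in candidates]
        # Prune by quality: the number of members removed varies per chunk.
        kept = [m for m in scored if m[1] >= self.min_weight]
        kept.sort(key=lambda m: m[1], reverse=True)
        self.members = kept[: self.max_members]


# Two hand-written "classifiers" standing in for trained models.
old_concept = lambda x: int(x > 0.5)
new_concept = lambda x: int(x <= 0.5)

ens = WeightedEnsemble()
ens.update([(0.1, 0), (0.9, 1), (0.7, 1), (0.2, 0)], old_concept)
before_drift = ens.predict(0.9)

# The labels flip (an abrupt drift); the stale member scores 0 and is pruned.
ens.update([(0.1, 1), (0.9, 0), (0.7, 0), (0.2, 1)], new_concept)
after_drift = ens.predict(0.9)
```

In this toy run the stale classifier drops below `min_weight` after the labels flip, so only the classifier fitted to the new concept keeps voting. A real stream learner would train its base classifiers on each chunk rather than take them as given.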

Author information

Authors and Affiliations

Department of Computer Engineering, University of Zanjan, Zanjan, 45371-38791, Iran
Omid Abbaszadeh, Ali Amiri & Ali Reza Khanteymoori

Corresponding author

Correspondence to Ali Amiri.

Additional information

ORCID: Omid ABBASZADEH, http://orcid.org/0000-0002-8923-940X

About this article

Cite this article

Abbaszadeh, O., Amiri, A. & Khanteymoori, A.R. An ensemble method for data stream classification in the presence of concept drift. Frontiers Inf Technol Electronic Eng 16, 1059–1068 (2015). https://doi.org/10.1631/FITEE.1400398

Received: 19 November 2014
Accepted: 15 April 2015
Published: 10 December 2015
Issue Date: December 2015
DOI: https://doi.org/10.1631/FITEE.1400398

Keywords

CLC number

TP391
