Abstract
Concept drift in the presence of noise is an important research topic in data mining. Many concept drift detection models have been proposed, advancing research on traditional concept drift detection. In this paper, we propose an anti-noise concept drift processing algorithm based on information entropy, named ACPJS. ACPJS uses the JS-divergence and Hoeffding bounds to set a double threshold for concept drift detection, and then constructs a horizontal integrated model for anti-noise concept drift processing. In comparison experiments on multiple data sets, the proposed algorithm performs well in concept drift detection, noise resistance, and classification accuracy.
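To make the double-threshold idea concrete, the following is a minimal sketch, not the ACPJS procedure itself. The abstract does not say how the two thresholds are derived, so everything here is an assumption for illustration: the function names, the use of two confidence levels (`delta_warn`, `delta_drift`) to obtain two Hoeffding bounds, and the comparison of the JS-divergence between a reference window and a current window against those bounds. The JS-divergence is computed in base 2, so it lies in [0, 1] and the range passed to the Hoeffding bound is 1.

```python
import math
from collections import Counter


def empirical_distribution(window):
    """Relative frequency of each discrete value in a window."""
    counts = Counter(window)
    total = len(window)
    return {value: count / total for value, count in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    keys = set(p) | set(q)
    # Mixture distribution M = (P + Q) / 2
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # KL(a || m); terms with a(k) = 0 contribute nothing
        return sum(a[k] * math.log2(a[k] / m[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def detect(reference, current, delta_warn=0.05, delta_drift=0.001):
    """Classify the current window as 'stable', 'warning', or 'drift'.

    Assumption: the two thresholds are Hoeffding bounds at two confidence
    levels; the source abstract does not specify this construction.
    """
    d = js_divergence(empirical_distribution(reference),
                      empirical_distribution(current))
    n = min(len(reference), len(current))
    eps_warn = hoeffding_bound(1.0, delta_warn, n)    # lower threshold
    eps_drift = hoeffding_bound(1.0, delta_drift, n)  # upper threshold
    if d > eps_drift:
        return "drift"
    if d > eps_warn:
        return "warning"
    return "stable"
```

In this sketch a smaller `delta` gives a larger bound, so the drift threshold always sits above the warning threshold. A divergence that falls between the two could be treated as possible noise rather than confirmed drift, which is one plausible reading of why a double threshold helps with noise.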
Acknowledgment
This research was supported by the National Natural Science Foundation of China under Grant No. 61603083, the Fundamental Research Funds of the Central Universities under Grant Nos. N162304009 and N182303036, the Major Project of Science and Technology Research of Hebei University under Grant No. ZD2017303, and the Open Research Fund of the State Key Laboratory of Management and Control for Complex Systems, Institute of Automation, Chinese Academy of Sciences, under Grant No. 20180105.
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Song, X., Qin, S., Niu, S., Wang, Y. (2019). ACPJS: An Anti-noise Concept Drift Processing Algorithm Based on JS-divergence. In: Lu, H., Tang, H., Wang, Z. (eds) Advances in Neural Networks – ISNN 2019. ISNN 2019. Lecture Notes in Computer Science, vol 11554. Springer, Cham. https://doi.org/10.1007/978-3-030-22796-8_47
DOI: https://doi.org/10.1007/978-3-030-22796-8_47
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22795-1
Online ISBN: 978-3-030-22796-8
eBook Packages: Computer Science, Computer Science (R0)