Abstract
Mining data streams has recently become an important and challenging task for many real-world applications, such as credit card fraud detection and sensor networks. One popular solution is to partition the stream into chunks, learn a base classifier from each chunk, and then integrate all base classifiers for classification. In this paper, we propose a new dynamic classifier selection (DCS) mechanism that integrates base classifiers for effective mining from data streams. The proposed algorithm dynamically selects a single "best" classifier for each test instance at run time. Our scheme uses statistical information from attribute values: each attribute partitions the evaluation set into disjoint subsets, and the classification accuracy of each base classifier is evaluated on these subsets. Given a test instance, its attribute values identify the subsets of the evaluation set that contain similar instances, and the classifier with the highest accuracy on those subsets is selected to classify the instance. Experimental results and comparative studies demonstrate the efficiency and efficacy of our method. Such a DCS scheme appears promising for mining data streams with dramatic concept drift or a significant amount of noise, where the base classifiers are likely to conflict or have low confidence.
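The selection mechanism described above can be illustrated with a minimal sketch. This is not the authors' exact procedure, only an assumed reading of it: each (attribute, value) pair partitions the evaluation set into a subset, per-classifier accuracy is recorded on every subset, and a test instance is routed to the classifier with the best summed accuracy over the subsets its own attribute values select. All function and variable names here are hypothetical.

```python
from collections import defaultdict

def build_accuracy_table(classifiers, eval_X, eval_y):
    """Map (attr_index, attr_value) -> list of per-classifier accuracies
    on the evaluation-set subset sharing that attribute value."""
    subsets = defaultdict(list)
    for i, x in enumerate(eval_X):
        for a, v in enumerate(x):
            subsets[(a, v)].append(i)          # disjoint subsets per attribute
    acc = {}
    for key, idxs in subsets.items():
        acc[key] = []
        for clf in classifiers:
            correct = sum(clf(eval_X[i]) == eval_y[i] for i in idxs)
            acc[key].append(correct / len(idxs))
    return acc

def select_and_classify(classifiers, acc, x):
    """Select the single 'best' classifier for test instance x at run time,
    then return its prediction."""
    scores = [0.0] * len(classifiers)
    hits = 0
    for a, v in enumerate(x):
        if (a, v) in acc:                      # only subsets seen in evaluation
            hits += 1
            for c, s in enumerate(acc[(a, v)]):
                scores[c] += s
    # Fall back to the first classifier if no matching subset exists.
    best = max(range(len(classifiers)), key=lambda c: scores[c]) if hits else 0
    return classifiers[best](x)
```

With two toy base classifiers (one constant, one that echoes an attribute), the instance is routed to whichever classifier was more accurate on the evaluation subset sharing its attribute values.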
Author information
Additional information
A preliminary version of this paper was published in the Proceedings of the 4th IEEE International Conference on Data Mining, pp 305–312, Brighton, UK
Xingquan Zhu received his Ph.D. degree in Computer Science from Fudan University, Shanghai, China, in 2001. He spent four months with Microsoft Research Asia, Beijing, China, where he worked on content-based image retrieval with relevance feedback. From 2001 to 2002, he was a Postdoctoral Associate in the Department of Computer Science, Purdue University, West Lafayette, IN. He is currently a Research Assistant Professor in the Department of Computer Science, University of Vermont, Burlington, VT. His research interests include data mining, machine learning, data quality, multimedia computing, and information retrieval. Since 2000, Dr. Zhu has published extensively, including over 40 refereed papers in various journals and conference proceedings.
Xindong Wu is a Professor and the Chair of the Department of Computer Science at the University of Vermont. He holds a Ph.D. in Artificial Intelligence from the University of Edinburgh, Britain. His research interests include data mining, knowledge-based systems, and Web information exploration. He has published extensively in these areas in various journals and conferences, including IEEE TKDE, TPAMI, ACM TOIS, IJCAI, ICML, KDD, ICDM, and WWW, as well as 11 books and conference proceedings. Dr. Wu is the Editor-in-Chief of the IEEE Transactions on Knowledge and Data Engineering (by the IEEE Computer Society), the founder and current Steering Committee Chair of the IEEE International Conference on Data Mining (ICDM), an Honorary Editor-in-Chief of Knowledge and Information Systems (by Springer), and a Series Editor of the Springer Book Series on Advanced Information and Knowledge Processing (AI&KP). He is the 2004 ACM SIGKDD Service Award winner.
Ying Yang received her Ph.D. in Computer Science from Monash University, Australia, in 2003. Following academic appointments at the University of Vermont, USA, she is currently a Research Fellow at Monash University, Australia. Dr. Yang is recognized for contributions in the fields of machine learning and data mining. She has published many scientific papers and book chapters on adaptive learning, proactive mining, noise cleansing, and discretization. Contact her at yyang@mail.csse.monash.edu.au.
Cite this article
Zhu, X., Wu, X. & Yang, Y. Effective classification of noisy data streams with attribute-oriented dynamic classifier selection. Knowl Inf Syst 9, 339–363 (2006). https://doi.org/10.1007/s10115-005-0212-y