Flexible decision tree for data stream classification in the presence of concept change, noise and missing values

Hashemi, Sattar; Yang, Ying

doi:10.1007/s10618-009-0130-9

Published: 17 March 2009

Volume 19, pages 95–131, (2009)
Data Mining and Knowledge Discovery

Abstract

In recent years, classification learning for data streams has become an important and active research topic. A major challenge posed by data streams is that their underlying concepts can change over time, which requires current classifiers to be revised accordingly and in a timely manner. A common way to detect concept change is to monitor the online classification accuracy: if accuracy drops below some threshold value, a concept change is deemed to have taken place. The implicit assumption behind this methodology is that any drop in classification accuracy can be interpreted as a symptom of concept change. Unfortunately, this assumption is often violated in the real world, where data streams carry noise that can also cause a significant reduction in classification accuracy. To compound the problem, traditional noise-cleansing methods are ill-suited to data streams: they normally need to scan the data multiple times, whereas learning from data streams can afford only a single pass because of the data’s high speed and huge volume. Another open problem in data stream classification is how to deal with missing values. When new instances containing missing values arrive, how a learning model should classify them, and how it should update itself in response, remains largely unexplored. To address these problems, this paper proposes a novel classification algorithm, the flexible decision tree (FlexDT), which extends fuzzy logic to data stream classification. The advantages are three-fold. First, FlexDT offers a flexible structure that handles concept change effectively and efficiently. Second, FlexDT is robust to noise, so it can prevent noise from interfering with classification accuracy, and a drop in accuracy can be safely attributed to concept change. Third, it deals with missing values in an elegant way.
Extensive evaluations are conducted to compare FlexDT with representative existing data stream classification algorithms using a large suite of data streams and various statistical tests. Experimental results suggest that FlexDT offers a significant benefit to data stream classification in real-world scenarios where concept change, noise and missing values coexist.
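
The accuracy-threshold methodology described in the abstract, and the way noise can defeat it, can be illustrated with a minimal sketch. This is not FlexDT and is not taken from the paper: the class `AccuracyThresholdDetector`, the helpers `make_stream`, `classify` and `first_alarm`, the window size of 50 and the threshold of 0.75 are all hypothetical choices made for illustration only.

```python
from collections import deque


class AccuracyThresholdDetector:
    """Signals a concept change when windowed online accuracy drops below a threshold.

    Hypothetical illustration of the generic methodology, not the paper's algorithm.
    """

    def __init__(self, window_size=50, threshold=0.75):
        self.window = deque(maxlen=window_size)  # 1 = correct, 0 = wrong
        self.threshold = threshold

    def update(self, predicted, actual):
        """Record one prediction outcome; return True if a change is signalled."""
        self.window.append(1 if predicted == actual else 0)
        if len(self.window) < self.window.maxlen:
            return False  # wait until the window is full
        accuracy = sum(self.window) / len(self.window)
        return accuracy < self.threshold


def make_stream(n, drift_at=None, flip_every=None):
    """Build a deterministic toy stream of (x, label) pairs (assumed data)."""
    stream = []
    for i in range(n):
        x = (i % 10) / 10
        label = 1 if x >= 0.5 else 0
        if drift_at is not None and i >= drift_at:
            label = 1 - label  # genuine concept change: the rule is inverted
        if flip_every is not None and i % flip_every == 0:
            label = 1 - label  # class noise: the label is corrupted
        stream.append((x, label))
    return stream


def classify(x):
    """Fixed classifier that matches the original concept."""
    return 1 if x >= 0.5 else 0


def first_alarm(stream, detector):
    """Return the index at which a change is first signalled, or None."""
    for i, (x, y) in enumerate(stream):
        if detector.update(classify(x), y):
            return i
    return None


# A real concept change at instance 200 is flagged shortly afterwards.
drift_alarm = first_alarm(make_stream(300, drift_at=200), AccuracyThresholdDetector())

# No concept change at all, but every third label is flipped:
# the detector still raises an alarm as soon as its window fills.
noise_alarm = first_alarm(make_stream(300, flip_every=3), AccuracyThresholdDetector())

# A clean, stationary stream never triggers the detector.
clean_alarm = first_alarm(make_stream(300), AccuracyThresholdDetector())
```

In this toy setting the detector cannot tell the noisy stream from the drifting one, since both push windowed accuracy under the threshold. That is the failure mode the abstract attributes to accuracy-based detection when the classifier itself is not robust to noise.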

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Sciences, Shiraz University, Shiraz, Iran
Sattar Hashemi
Australian Taxation Office, Melbourne, VIC, Australia
Ying Yang

Corresponding author

Correspondence to Ying Yang.

Additional information

Responsible editor: Bart Goethals.

About this article

Cite this article

Hashemi, S., Yang, Y. Flexible decision tree for data stream classification in the presence of concept change, noise and missing values. Data Min Knowl Disc 19, 95–131 (2009). https://doi.org/10.1007/s10618-009-0130-9

Received: 08 February 2008
Accepted: 03 March 2009
Published: 17 March 2009
Issue Date: August 2009
DOI: https://doi.org/10.1007/s10618-009-0130-9
