MHFlexDT: A Multivariate Branch Fuzzy Decision Tree Data Stream Mining Strategy Based on Hybrid Partitioning Standard

  • Conference paper
Advances in Neural Networks – ISNN 2018 (ISNN 2018)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 10878)

Abstract

Fast data streams do not allow random access or multi-pass scanning, and traditional data mining algorithms cannot sample every instance of a stream. Fuzzy decision trees combine the understandability of decision trees with the ability of fuzzy sets to represent fuzzy and uncertain information, so data stream mining algorithms based on fuzzy decision tree theory are valuable for improving mining accuracy. Existing fuzzy decision tree methods assign samples to leaf nodes according to their membership, which yields low accuracy on low-membership samples with missing values. To address this problem, this paper presents a multivariate branch fuzzy decision tree data stream mining strategy based on a hybrid partitioning standard (MHFlexDT), which constructs a multivariate branch fuzzy tree structure. The data over-fitting problem is addressed by adding temporary branches for uncertain data, and the depth of the decision tree is limited by a McDiarmid bound threshold. Experimental results show that, compared with the fuzzy decision tree data mining strategy, MHFlexDT is more effective in large-scale data stream mining: it reduces system computation, controls decision tree depth, and maintains high accuracy in the presence of missing values, over-fitting, and noisy data.
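
The abstract does not give the threshold itself, so the following is only a minimal sketch of how a McDiarmid-style bound is commonly used to decide when a stream decision tree node may split, which is also what keeps the tree from growing too deep. It assumes a bound of the form ε = c · sqrt(ln(1/δ) / (2n)); the function names, the constant `c`, and the example gain values are hypothetical and are not taken from the paper.

```python
import math


def mcdiarmid_style_epsilon(n_samples: int, delta: float, c: float = 1.0) -> float:
    """Return eps = c * sqrt(ln(1/delta) / (2 * n_samples)).

    `c` is a placeholder for the sensitivity constant of the chosen split
    measure (an assumption here; the paper's value is not stated).
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return c * math.sqrt(math.log(1.0 / delta) / (2.0 * n_samples))


def should_split(best_gain: float, second_gain: float,
                 n_samples: int, delta: float, c: float = 1.0) -> bool:
    """Split only when the best attribute beats the runner-up by more than eps.

    With few samples eps is large, so the node keeps collecting data
    instead of splitting, which limits tree depth.
    """
    eps = mcdiarmid_style_epsilon(n_samples, delta, c)
    return (best_gain - second_gain) > eps


if __name__ == "__main__":
    # Hypothetical split-measure values observed at a leaf.
    for n in (10, 100, 10_000):
        print(n, should_split(0.30, 0.10, n_samples=n, delta=1e-3))
```

Under these assumptions the same gain difference that is rejected after 10 samples is accepted after 10,000, because the threshold shrinks as 1/sqrt(n).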

Acknowledgment

The research work was supported by the National Natural Science Foundation of China under Grant Nos. 61403069 and 61603083, the Fundamental Research Funds of the Central Universities under Grant No. N162304009, and the Major Project of Science and Technology Research of Hebei University under Grant No. ZD2017303.

Author information

Corresponding author

Correspondence to Xin Song.

Copyright information

© 2018 Springer International Publishing AG, part of Springer Nature

About this paper

Cite this paper

Song, X., Wang, H., He, H., Meng, Y. (2018). MHFlexDT: A Multivariate Branch Fuzzy Decision Tree Data Stream Mining Strategy Based on Hybrid Partitioning Standard. In: Huang, T., Lv, J., Sun, C., Tuzikov, A. (eds) Advances in Neural Networks – ISNN 2018. ISNN 2018. Lecture Notes in Computer Science, vol 10878. Springer, Cham. https://doi.org/10.1007/978-3-319-92537-0_36

  • DOI: https://doi.org/10.1007/978-3-319-92537-0_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-92536-3

  • Online ISBN: 978-3-319-92537-0

  • eBook Packages: Computer Science, Computer Science (R0)