On Robustness of Adaptive Random Forest Classifier on Biomedical Data Stream

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2020)

Abstract

Data streams pose a significant challenge for data analysis and data mining techniques because those techniques were developed for training on batch data. A classification technique that deals with a data stream should be able to adapt its model to new samples and forget old ones. In this paper, we present an intensive comparison of the performance of six popular classification techniques, focusing on the power of Adaptive Random Forest. The comparison was based on four real medical datasets; for more reliable results, 40 additional datasets were created by adding white noise to the original datasets. The experimental results showed the dominance of Adaptive Random Forest over the five other techniques, with high robustness against changes in the data and against noise.
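The noise-augmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the noise levels, how noise was scaled, or how the 40 noisy datasets were split across the four originals, so everything below is an assumption: the helper names `add_white_noise` and `make_noisy_variants`, the ten evenly spaced levels, the scaling of noise by each feature's standard deviation, and the toy feature rows are all hypothetical.

```python
import random
from statistics import pstdev


def add_white_noise(rows, noise_level, seed=0):
    """Return a copy of `rows` with zero-mean Gaussian noise added to every feature.

    Assumption: the noise standard deviation of each column is
    `noise_level` times that column's own standard deviation.
    """
    rng = random.Random(seed)
    columns = list(zip(*rows))
    sigmas = [noise_level * pstdev(col) for col in columns]
    return [
        [value + rng.gauss(0.0, sigma) for value, sigma in zip(row, sigmas)]
        for row in rows
    ]


def make_noisy_variants(rows, levels, seed=0):
    """Build one noisy copy of `rows` per noise level, keyed by level index."""
    return [
        add_white_noise(rows, level, seed=seed + i)
        for i, level in enumerate(levels)
    ]


# Toy numeric feature rows standing in for one medical dataset (hypothetical values).
original = [[5.1, 3.5], [4.9, 3.0], [6.2, 2.9], [5.9, 3.2]]

# Assumed: ten noise levels per dataset, so four datasets would yield 40 variants.
levels = [0.05 * k for k in range(1, 11)]
variants = make_noisy_variants(original, levels)
```

Each variant keeps the shape and labels of its source dataset, so the same stream classifier can be evaluated on the clean and noisy versions and the drop in accuracy compared across noise levels.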

A. Kiss was also with J. Selye University, Komarno, Slovakia.

Acknowledgment

The project was supported by the European Union, co-financed by the European Social Fund (EFOP-3.6.3-VEKOP-16-2017-00002).

Author information

Corresponding author

Correspondence to Attila Kiss.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Fatlawi, H.K., Kiss, A. (2020). On Robustness of Adaptive Random Forest Classifier on Biomedical Data Stream. In: Nguyen, N., Jearanaitanakij, K., Selamat, A., Trawiński, B., Chittayasothorn, S. (eds) Intelligent Information and Database Systems. ACIIDS 2020. Lecture Notes in Computer Science, vol 12033. Springer, Cham. https://doi.org/10.1007/978-3-030-41964-6_29

  • DOI: https://doi.org/10.1007/978-3-030-41964-6_29

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-41963-9

  • Online ISBN: 978-3-030-41964-6

  • eBook Packages: Computer Science (R0)
