Dynamic Ensemble Selection for Imbalanced Data Stream Classification with Limited Label Access

  • Conference paper
Artificial Intelligence and Soft Computing (ICAISC 2021)

Abstract

Real data streams, besides being subject to concept drift, often exhibit a high imbalance ratio. Another important problem in real classification tasks, often overlooked in the literature, is the cost of obtaining labels. This work aims to connect three rarely combined research directions, i.e., data stream classification, imbalanced data classification, and limited access to labels. For this purpose, the behavior of the desisc-sb framework, proposed by the authors in earlier work for classifying highly imbalanced data streams, was examined under the scenario of limited label access. Experiments conducted on synthetic and real streams confirmed the potential of using desisc-sb to classify highly imbalanced data streams even when label availability is low.

Notes

  1. https://github.com/w4k2/icaisc21-al-stream

Acknowledgment

This work was supported by the Polish National Science Centre under the grant No. 2017/27/B/ST6/01325.

Author information

Corresponding author

Correspondence to Paweł Zyblewski.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Zyblewski, P., Woźniak, M. (2021). Dynamic Ensemble Selection for Imbalanced Data Stream Classification with Limited Label Access. In: Rutkowski, L., Scherer, R., Korytkowski, M., Pedrycz, W., Tadeusiewicz, R., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2021. Lecture Notes in Computer Science, vol 12855. Springer, Cham. https://doi.org/10.1007/978-3-030-87897-9_20

  • DOI: https://doi.org/10.1007/978-3-030-87897-9_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-87896-2

  • Online ISBN: 978-3-030-87897-9

  • eBook Packages: Computer Science, Computer Science (R0)
