Classification in Non-stationary Environments Using Coresets over Sliding Windows

  • Conference paper
Advances in Computational Intelligence (IWANN 2021)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 12861)

Abstract

In non-stationary environments, several constraints require algorithms to be fast, memory-efficient, and highly adaptable. While several lazy learners and tree classifiers exist for the streaming context, prototype-based classifiers have received little attention there. Prototype-based classifiers nevertheless have characteristics that are also useful in streaming environments, in particular their high interpretability. Hence, we propose a new prototype-based classifier based on Minimum Enclosing Balls over sliding windows. We propose this algorithm in both a linear and a kernelized version. Our experiments show that this technique can be useful and is comparable in performance to another popular prototype-based streaming classifier, the Adaptive Robust Soft Learning Vector Quantization, with the additional benefits of a configurable window size to catch rapidly changing drift and the ability to use its internal mechanics for drift detection.
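To make the idea of a prototype-based classifier built on Minimum Enclosing Balls over a sliding window more concrete, the following is a minimal sketch and not the authors' implementation. It assumes one ball per class, fitted with the simple Bădoiu-Clarkson iteration over the samples currently in a fixed-size window, and it predicts the class whose ball center is nearest. The names `approx_meb`, `SlidingWindowMEBClassifier`, `partial_fit`, and `window_size` are hypothetical and chosen only for illustration; only the linear (non-kernelized) case is shown.

```python
from collections import deque

import numpy as np


def approx_meb(points, n_iter=100):
    """Approximate the minimum enclosing ball with Badoiu-Clarkson steps.

    Returns (center, radius). This is a generic approximation scheme,
    assumed here for illustration; the paper's coreset method may differ.
    """
    pts = np.array(list(points), dtype=float)
    center = pts[0].copy()
    for i in range(1, n_iter + 1):
        # Move the center a shrinking step towards the current farthest point.
        farthest = pts[np.argmax(np.linalg.norm(pts - center, axis=1))]
        center += (farthest - center) / (i + 1)
    radius = float(np.max(np.linalg.norm(pts - center, axis=1)))
    return center, radius


class SlidingWindowMEBClassifier:
    """Hypothetical sketch: one ball-center prototype per class, per window."""

    def __init__(self, window_size=100):
        # Old samples fall out automatically once the window is full.
        self.window = deque(maxlen=window_size)

    def partial_fit(self, x, y):
        """Add one labeled sample to the sliding window."""
        self.window.append((np.asarray(x, dtype=float), y))

    def predict(self, x):
        """Return the label whose ball center is closest to x."""
        x = np.asarray(x, dtype=float)
        by_class = {}
        for xi, yi in self.window:
            by_class.setdefault(yi, []).append(xi)
        best_label, best_dist = None, np.inf
        for label, samples in by_class.items():
            center, _ = approx_meb(samples)
            dist = float(np.linalg.norm(x - center))
            if dist < best_dist:
                best_label, best_dist = label, dist
        return best_label
```

In this sketch, a smaller `window_size` makes the prototypes forget old concepts sooner, which mirrors the configurable window size mentioned in the abstract. The ball radii are computed but unused here; monitoring how they change over time is one conceivable hook for drift detection, though that is an assumption rather than something this sketch implements.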

M. Heusinger—Supported by StMWi, project OBerA, grant number IUK-1709-0011// IUK530/010.

Notes

  1. Implementation can be found on https://github.com/foxriver76/meb-classifier.

  2. Experiments can be found on https://github.com/foxriver76/meb-classifier.

  3. https://www.kaggle.com/c/GiveMeSomeCredit.

Author information

Corresponding author

Correspondence to Moritz Heusinger.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Heusinger, M., Schleif, F.-M. (2021). Classification in Non-stationary Environments Using Coresets over Sliding Windows. In: Rojas, I., Joya, G., Català, A. (eds) Advances in Computational Intelligence. IWANN 2021. Lecture Notes in Computer Science, vol 12861. Springer, Cham. https://doi.org/10.1007/978-3-030-85030-2_11

  • DOI: https://doi.org/10.1007/978-3-030-85030-2_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85029-6

  • Online ISBN: 978-3-030-85030-2

  • eBook Packages: Computer Science, Computer Science (R0)
