Soft Voting Windowing Ensembles for Learning from Partially Labelled Streams

  • Conference paper
  • First Online:

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11948)

Abstract

Mining data streams has become an important topic due to the increasing availability of vast amounts of online data. In such incremental learning scenarios, observations arrive in a sequence over time and are subject to changes in the data distribution, known as concept drift. Interleaved test-then-train evaluation is often used during supervised learning from streaming data: each instance is first used to test the model and then used to train it. However, true class labels may be missing or may arrive well after the prediction, meaning they cannot be used for training or drift detection. Based on these considerations, we introduce LESS-TWE, an ensemble-based method for online learning in domains where full reliance on labels is infeasible. Our approach combines weighted soft voting with unsupervised drift detection to reduce the dependency on labels during model construction. When the true label is unavailable, the most confident label, as predicted through weighted soft voting, is used instead. Similarly, our unlabelled drift detector flags drifts based on the voting confidence rather than the true label. Our experimental evaluation indicates that our algorithm is very fast, achieves predictive accuracy comparable to the state of the art, and outperforms baseline methods.
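
A minimal sketch of these ideas in Python follows. It is not the authors' LESS-TWE implementation (which builds on a scikit-multiflow fork, linked in the Notes): the class name WeightedSoftVotingEnsemble, the choice of member learners, the weights, and the confidence and drift thresholds are illustrative assumptions, and the sliding-window aspect of the paper's ensembles is omitted. The sketch only shows weighted soft voting, pseudo-labelling of unlabelled instances with the most confident vote, and a label-free drift flag based on voting confidence.

```python
# Minimal sketch (not the authors' LESS-TWE code): weighted soft voting,
# pseudo-labelling when the true label is missing, and a confidence-based
# drift flag that needs no labels. Member models and thresholds are assumptions.
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import SGDClassifier


class WeightedSoftVotingEnsemble:
    def __init__(self, members, weights, classes,
                 conf_threshold=0.6, drift_threshold=0.55):
        self.members = members                 # incremental learners with partial_fit / predict_proba
        self.weights = np.asarray(weights, dtype=float)
        self.classes = np.asarray(classes)     # assumes members see classes in this (sorted) order
        self.conf_threshold = conf_threshold   # minimum confidence to accept a pseudo-label
        self.drift_threshold = drift_threshold # mean confidence below this flags a possible drift

    def _vote(self, X):
        # Weighted average of each member's class-probability estimates (soft voting).
        probas = [w * m.predict_proba(X) for m, w in zip(self.members, self.weights)]
        return np.sum(probas, axis=0) / self.weights.sum()

    def test_then_train(self, X, y=None):
        # Interleaved test-then-train: predict first, then use the batch for training.
        votes = self._vote(X)
        y_pred = self.classes[np.argmax(votes, axis=1)]
        confidence = votes.max(axis=1)

        if y is None:
            # No true labels: train on the most confident pseudo-labels only.
            mask = confidence >= self.conf_threshold
            X_train, y_train = X[mask], y_pred[mask]
        else:
            X_train, y_train = X, np.asarray(y)

        if len(X_train) > 0:
            for m in self.members:
                m.partial_fit(X_train, y_train, classes=self.classes)

        # Unsupervised drift signal: overall voting confidence dropped, no labels needed.
        drift_flagged = bool(confidence.mean() < self.drift_threshold)
        return y_pred, drift_flagged


# Usage on a synthetic two-class stream where only every other batch is labelled.
rng = np.random.default_rng(0)
ensemble = WeightedSoftVotingEnsemble(
    members=[GaussianNB(), SGDClassifier(loss="log_loss")],
    weights=[1.0, 1.0],
    classes=[0, 1],
)

# Prime the members with one labelled batch so predict_proba is available.
X0 = rng.normal(size=(32, 5))
y0 = (X0[:, 0] > 0).astype(int)
for m in ensemble.members:
    m.partial_fit(X0, y0, classes=ensemble.classes)

for t in range(20):
    X = rng.normal(size=(32, 5))
    y = (X[:, 0] > 0).astype(int)
    y_pred, drift = ensemble.test_then_train(X, y if t % 2 == 0 else None)
```

In practice, a flagged drift would trigger adaptation, for example resetting or retraining ensemble members on a recent window of instances.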


Notes

  1. https://github.com/SeanLF/scikit-multiflow.


Author information

Corresponding author

Correspondence to Herna L. Viktor.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Floyd, S.L.A., Viktor, H.L. (2020). Soft Voting Windowing Ensembles for Learning from Partially Labelled Streams. In: Ceci, M., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2019. Lecture Notes in Computer Science (LNAI), vol 11948. Springer, Cham. https://doi.org/10.1007/978-3-030-48861-1_6

  • DOI: https://doi.org/10.1007/978-3-030-48861-1_6

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-48860-4

  • Online ISBN: 978-3-030-48861-1

  • eBook Packages: Computer Science, Computer Science (R0)
