
Semi-supervised Ensemble Learning of Data Streams in the Presence of Concept Drift

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7209)

Abstract

Increasing access to very large, non-stationary datasets in many real-world problems has made classical data mining algorithms impractical and created a need for new online classification algorithms. Online learning from data streams has several distinguishing features: sequential access to the data, limits on time and space complexity, and the occurrence of concept drift. Because data streams are unbounded, it is hard to label every observed instance, so semi-supervised approaches appear better suited to the problem. In this paper we present a new semi-supervised ensemble learning algorithm for data streams. The algorithm uses the majority vote of its learners to label unlabeled instances. An empirical study demonstrates that the proposed algorithm is comparable with state-of-the-art semi-supervised online algorithms.
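The labeling step mentioned in the abstract, a majority vote of the ensemble's learners over an unlabeled instance, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the function name `majority_vote_label`, the `min_agreement` threshold, and the toy threshold learners are all assumptions made for illustration, and drift handling and learner updates are left out.

```python
from collections import Counter
from typing import Callable, Hashable, Optional, Sequence

# A learner is modeled here as any callable mapping an instance to a label
# (an assumption; the paper's learners are not specified in the abstract).
Learner = Callable[[Sequence[float]], Hashable]


def majority_vote_label(
    learners: Sequence[Learner],
    x: Sequence[float],
    min_agreement: float = 0.5,
) -> Optional[Hashable]:
    """Return the label chosen by more than `min_agreement` of the learners.

    Returns None when the ensemble is empty or no label reaches the
    required share of votes, so the instance stays unlabeled.
    `min_agreement` is a hypothetical parameter, not taken from the paper.
    """
    if not learners:
        return None
    votes = Counter(learner(x) for learner in learners)
    label, count = votes.most_common(1)[0]
    if count / len(learners) > min_agreement:
        return label
    return None


# Toy ensemble: three threshold classifiers on a single feature.
learners = [
    lambda x: int(x[0] > 0.3),
    lambda x: int(x[0] > 0.5),
    lambda x: int(x[0] > 0.7),
]

# Unlabeled instances arriving one at a time, as in a stream.
unlabeled_stream = [[0.1], [0.6], [0.9]]

pseudo_labeled = []
for x in unlabeled_stream:
    label = majority_vote_label(learners, x)
    if label is not None:
        # In a full system, these pairs would be used to update the learners.
        pseudo_labeled.append((x, label))
```

Requiring a strict majority is one possible design choice: instances on which the learners disagree too much are skipped rather than given a possibly noisy pseudo-label.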




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ahmadi, Z., Beigy, H. (2012). Semi-supervised Ensemble Learning of Data Streams in the Presence of Concept Drift. In: Corchado, E., Snášel, V., Abraham, A., Woźniak, M., Graña, M., Cho, SB. (eds) Hybrid Artificial Intelligent Systems. HAIS 2012. Lecture Notes in Computer Science, vol 7209. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28931-6_50


  • DOI: https://doi.org/10.1007/978-3-642-28931-6_50

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28930-9

  • Online ISBN: 978-3-642-28931-6

  • eBook Packages: Computer Science (R0)
