Mining Multi-label Concept-Drifting Data Streams Using Dynamic Classifier Ensemble

  • Conference paper
Advances in Machine Learning (ACML 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5828)

Abstract

The problem of mining single-label data streams has been extensively studied in recent years. However, not enough attention has been paid to the problem of mining multi-label data streams. In this paper, we propose an improved binary relevance method that takes advantage of dependence information among class labels, and a dynamic classifier ensemble approach for classifying multi-label concept-drifting data streams. Our classification algorithm uses a weighted majority voting strategy. Our empirical study on both synthetic and real-life data sets shows that the proposed dynamic classifier ensemble with improved binary relevance outperforms both the dynamic classifier ensemble with binary relevance and the static classifier ensemble with binary relevance.
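As a rough illustration of the weighted majority voting idea mentioned in the abstract, the sketch below combines per-label 0/1 decisions from several base classifiers, in the spirit of binary relevance. It is not the paper's algorithm: the function name `weighted_majority_vote`, the classifier interface, the strict-majority threshold, and the fixed example weights are all assumptions made for illustration. In a dynamic ensemble the weights would presumably be recomputed for each incoming instance, which this sketch does not attempt.

```python
from typing import Callable, List, Sequence

# Hypothetical interface: a base classifier maps a feature vector
# to one 0/1 decision per label (binary relevance style output).
LabelVector = List[int]
Classifier = Callable[[Sequence[float]], LabelVector]


def weighted_majority_vote(
    classifiers: Sequence[Classifier],
    weights: Sequence[float],
    x: Sequence[float],
    num_labels: int,
) -> LabelVector:
    """Combine per-label 0/1 predictions by weighted majority voting."""
    if len(classifiers) != len(weights):
        raise ValueError("need exactly one weight per classifier")

    total_weight = sum(weights)
    positive_weight = [0.0] * num_labels

    # Accumulate, for each label, the weight of the classifiers voting for it.
    for clf, w in zip(classifiers, weights):
        prediction = clf(x)
        for j in range(num_labels):
            if prediction[j] == 1:
                positive_weight[j] += w

    # Assumed decision rule: assign label j only if the positive votes
    # carry strictly more than half of the total weight.
    return [1 if positive_weight[j] > total_weight / 2 else 0
            for j in range(num_labels)]


# Toy usage: three constant "classifiers" over three labels.
toy_classifiers = [
    lambda x: [1, 0, 1],
    lambda x: [1, 1, 0],
    lambda x: [0, 1, 0],
]
toy_weights = [0.6, 0.3, 0.1]
print(weighted_majority_vote(toy_classifiers, toy_weights, [0.0], 3))
```

With these toy weights, the first and third labels are supported by more than half of the total weight, so the sketch predicts them and rejects the second.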

This work is supported by Young Cadreman Supporting Program of Northwest A&F University (01140301). Corresponding author: Zhang Yang.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Qu, W., Zhang, Y., Zhu, J., Qiu, Q. (2009). Mining Multi-label Concept-Drifting Data Streams Using Dynamic Classifier Ensemble. In: Zhou, ZH., Washio, T. (eds) Advances in Machine Learning. ACML 2009. Lecture Notes in Computer Science, vol 5828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05224-8_24

  • DOI: https://doi.org/10.1007/978-3-642-05224-8_24

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-05223-1

  • Online ISBN: 978-3-642-05224-8

  • eBook Packages: Computer Science, Computer Science (R0)
