Mining Multi-label Concept-Drifting Data Streams Using Dynamic Classifier Ensemble

Qu, Wei; Zhang, Yang; Zhu, Junping; Qiu, Qiang

doi:10.1007/978-3-642-05224-8_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5828)

Included in the following conference series:

Asian Conference on Machine Learning

Abstract

The problem of mining single-label data streams has been studied extensively in recent years, but mining multi-label data streams has received far less attention. In this paper, we propose an improved binary relevance method that exploits dependence information among class labels, and a dynamic classifier ensemble approach for classifying multi-label concept-drifting data streams. Our classification algorithm uses a weighted majority voting strategy. Our empirical study on both synthetic and real-life data sets shows that the proposed dynamic classifier ensemble with improved binary relevance outperforms both the dynamic classifier ensemble with binary relevance and the static classifier ensemble with binary relevance.
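
As a rough illustration of the weighted majority voting named in the abstract, the following minimal Python sketch combines the per-label 0/1 decisions of several binary relevance classifiers. It is not the authors' algorithm: the function name `weighted_majority_vote`, the representation of a classifier as a plain function, the half-of-total-weight threshold, and the toy weights are all assumptions made for illustration. In particular, the abstract does not say how the dynamic ensemble derives its weights, so they are simply taken as inputs here.

```python
from typing import Callable, List, Sequence

# Assumption: a multi-label classifier is any function that maps a feature
# vector to a list of 0/1 decisions, one per label (binary relevance output).
MultiLabelClassifier = Callable[[Sequence[float]], List[int]]


def weighted_majority_vote(
    classifiers: Sequence[MultiLabelClassifier],
    weights: Sequence[float],
    x: Sequence[float],
    num_labels: int,
) -> List[int]:
    """Predict label j when the weight voting for j exceeds half the total weight."""
    if len(classifiers) != len(weights):
        raise ValueError("need exactly one weight per classifier")
    total = sum(weights)
    scores = [0.0] * num_labels
    for clf, w in zip(classifiers, weights):
        votes = clf(x)
        for j in range(num_labels):
            scores[j] += w * votes[j]
    return [1 if s > total / 2 else 0 for s in scores]


# Hypothetical toy ensemble: each member ignores x and returns fixed votes.
toy_classifiers = [
    lambda x: [1, 0, 1],
    lambda x: [0, 1, 1],
    lambda x: [0, 1, 0],
]
toy_weights = [3.0, 2.0, 2.0]  # illustrative weights, not taken from the paper

prediction = weighted_majority_vote(toy_classifiers, toy_weights, [0.0], 3)
print(prediction)  # [0, 1, 1]
```

With a total weight of 7.0, a label needs more than 3.5 in supporting weight: the first label collects 3.0 and is rejected, while the second and third collect 4.0 and 5.0 and are accepted.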

This work is supported by Young Cadreman Supporting Program of Northwest A&F University (01140301). Corresponding author: Zhang Yang.

Author information

Authors and Affiliations

College of Information Engineering, Northwest A&F University, Yangling, Shaanxi Province, P.R. China, 712100
Wei Qu, Yang Zhang, Junping Zhu & Qiang Qiu

Editor information

Editors and Affiliations

National Laboratory for Novel Software Technology, Nanjing University, 22 Hankou Road, 210093, Nanjing, China
Zhi-Hua Zhou
The Institute of Scientific and Industrial Research, Osaka University, 8-1 Mihogaoka, 567, Osaka, Ibaraki, Japan
Takashi Washio

Cite this paper

Qu, W., Zhang, Y., Zhu, J., Qiu, Q. (2009). Mining Multi-label Concept-Drifting Data Streams Using Dynamic Classifier Ensemble. In: Zhou, ZH., Washio, T. (eds) Advances in Machine Learning. ACML 2009. Lecture Notes in Computer Science, vol 5828. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05224-8_24

DOI: https://doi.org/10.1007/978-3-642-05224-8_24
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05223-1
Online ISBN: 978-3-642-05224-8
eBook Packages: Computer Science, Computer Science (R0)
