DOI:10.1145/3412841.3441922
research-article

A new self-organizing map based algorithm for multi-label stream classification

Published: 22 April 2021

Abstract

Several algorithms have been proposed for offline multi-label classification. However, applications in areas such as traffic monitoring, social networks, and sensors produce data continuously, as so-called data streams, which poses challenges to batch multi-label learning. Because the distribution of a data stream is not stationary, new algorithms are needed that adapt online to such changes (concept drift). Moreover, in realistic applications, changes occur in scenarios with infinitely delayed labels, where the true classes of the arriving instances are never available. We propose an online, unsupervised, incremental method based on self-organizing maps for multi-label stream classification in scenarios with infinitely delayed labels. We assume that an initial set of labeled instances is available and use it to train one self-organizing map per label. The learned models are then used and adapted in an evolving stream to classify new instances, under the assumption that their classes will never become available. We adapt to incremental concept drifts by updating online the weight vectors of winner neurons and the label cardinality of the dataset. Predictions are obtained using the Bayes rule and the outputs of each neuron, adapting the prior and conditional probabilities of the classes in the stream. Experiments using synthetic and real datasets show that our method is highly competitive with several methods from the literature, in both stationary and concept drift scenarios.
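The two online updates named in the abstract (winner-neuron weight vectors and label cardinality) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed learning rate `lr`, the squared Euclidean distance, and the running-mean form of the cardinality update are all assumptions chosen for illustration, and the Bayes-rule prediction step is not covered.

```python
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


def find_winner(weights: List[Vector], x: Sequence[float]) -> int:
    """Index of the neuron whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda i: squared_distance(weights[i], x))


def update_winner(weights: List[Vector], x: Sequence[float], lr: float = 0.1) -> int:
    """Move only the winner neuron toward x (assumed rule: w += lr * (x - w)).

    No true label is used, which matches the infinitely-delayed-label setting.
    """
    j = find_winner(weights, x)
    weights[j] = [w + lr * (xi - w) for w, xi in zip(weights[j], x)]
    return j


def update_cardinality(mean: float, n_seen: int, n_predicted: int) -> float:
    """Running mean of labels per instance (assumed form of the cardinality update)."""
    return mean + (n_predicted - mean) / (n_seen + 1)


# Hypothetical map for a single label, with two neurons in a 2-D feature space.
label_map = [[0.0, 0.0], [1.0, 1.0]]
winner = update_winner(label_map, [0.8, 1.0], lr=0.5)
cardinality = update_cardinality(mean=2.0, n_seen=4, n_predicted=3)
```

In this sketch each label would own one such list of weight vectors, and the stream loop would call `update_winner` on the map of every label predicted for the incoming instance.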

Published In

SAC '21: Proceedings of the 36th Annual ACM Symposium on Applied Computing
March 2021
2075 pages
ISBN:9781450381048
DOI:10.1145/3412841

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. classification
  2. concept drift
  3. data streams
  4. machine learning
  5. multi-label
  6. self-organizing maps

Qualifiers

  • Research-article

Funding Sources

  • FAPESP

Conference

SAC '21
SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing
March 22 - 26, 2021
Virtual Event, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
