Abstract
The main motivation for the presented research was to investigate the behavior of different convolutional neural network architectures in the analysis of non-stationary data streams. Training a model on continuously arriving data differs from training on a complete, immediately available learning set. Streaming data is, however, much closer to reality, as nowadays most data needs to be analyzed as soon as it arrives (e.g., in anti-fraud systems, cybersecurity, and the analysis of images from on-board cameras and other sensors). Besides the limited computational and memory resources that such algorithms must respect, one of the critical difficulties is the possibility of concept drift, i.e., a change in the probabilistic characteristics of the considered task, which may in consequence lead to a significant decrease in classification accuracy. This paper pays special attention to convolutional neural network models based on probabilistic methods: Monte Carlo dropout and Bayesian convolutional neural networks. Of particular interest was the uncertainty of the predictions returned by the model, which may grow mainly during the classification of drifting data streams. Under such conditions, the prediction system should be able to report the high uncertainty of its predictions and signal the need to update the model in use. This paper aims to study the behavior of the models mentioned above in the task of classifying non-stationary data streams and to determine the impact of a sudden drift on the accuracy and uncertainty of the predictions.
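The paper itself does not publish code; the following is a minimal sketch of the Monte Carlo dropout idea described in the abstract, assuming a PyTorch setup. The network `SmallCNN`, the helper `mc_dropout_predict`, the number of stochastic passes, and the entropy threshold are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Toy CNN whose dropout layer is kept active at inference for MC dropout."""
    def __init__(self, n_classes: int = 10):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.drop = nn.Dropout(p=0.5)
        self.fc = nn.Linear(32 * 7 * 7, n_classes)

    def forward(self, x):
        x = F.max_pool2d(F.relu(self.conv1(x)), 2)  # 28x28 -> 14x14
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)  # 14x14 -> 7x7
        x = self.drop(torch.flatten(x, 1))
        return self.fc(x)

@torch.no_grad()
def mc_dropout_predict(model: nn.Module, x: torch.Tensor, n_samples: int = 30):
    """Average n_samples stochastic forward passes (dropout enabled) and
    return the mean class probabilities plus the predictive entropy."""
    model.train()  # keeps dropout active; model assumed BatchNorm-free
    probs = torch.stack(
        [F.softmax(model(x), dim=1) for _ in range(n_samples)]
    )                               # (n_samples, batch, n_classes)
    mean_probs = probs.mean(dim=0)  # approximate predictive distribution
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=1)
    return mean_probs, entropy

# Usage: rising entropy on incoming batches can flag a possible concept drift.
model = SmallCNN()
batch = torch.randn(8, 1, 28, 28)   # e.g. MNIST-shaped inputs
mean_probs, entropy = mc_dropout_predict(model, batch)
drifted = entropy > 1.5             # illustrative threshold, not from the paper
```

A high predictive entropy on a batch from the stream would, under this scheme, serve as the uncertainty signal the abstract describes: a hint that the concept has changed and the model should be updated.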
Acknowledgements
This work is supported in part by the CEUS-UNISONO programme, which has received funding from the National Science Centre, Poland under grant agreement No. 2020/02/Y/ST6/00037 and by the Research Fund of Department of Systems and Computer Networks, Faculty of ICT, Wroclaw University of Science and Technology.