Abstract
Convolutional neural networks suffer from catastrophic forgetting during continual learning. This is one of the major obstacles for artificial intelligence: solving new problems without forgetting previously learned information. In this article, we propose an episodic memory technique for learning sound data incrementally. The proposed method observes tasks sequentially and solves each new task without forgetting the previous tasks. The results show that the proposed method transfers knowledge both backward and forward efficiently. The performance evaluation demonstrates that the proposed method outperforms the other benchmark methods. On the ESC-50 and UrbanSound8K datasets, the proposed method obtained 96.5% and 93.1% accuracy, respectively.
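The abstract's claim about backward and forward transfer can be made concrete with the metrics commonly used in the continual learning literature (average accuracy, backward transfer, forward transfer). The sketch below is illustrative only: it assumes these standard definitions apply, and the function name `continual_metrics` and the accuracy values in the usage example are hypothetical, not taken from the article's experiments.

```python
from typing import Sequence, Tuple


def continual_metrics(
    acc: Sequence[Sequence[float]],
    baseline: Sequence[float],
) -> Tuple[float, float, float]:
    """Return (average accuracy, backward transfer, forward transfer).

    acc[i][j]   : test accuracy on task j after training on tasks 0..i.
    baseline[j] : test accuracy on task j of an untrained model
                  (assumed reference point for forward transfer).
    """
    n = len(acc)
    if n < 2 or len(baseline) != n or any(len(row) != n for row in acc):
        raise ValueError("need a square accuracy matrix with at least 2 tasks")

    final = acc[n - 1]  # accuracies after the last task has been learned

    # Average accuracy over all tasks at the end of training.
    average_accuracy = sum(final) / n

    # Backward transfer: change in accuracy on each earlier task between
    # the moment it was learned and the end of training.
    # Negative values indicate forgetting.
    backward = sum(final[j] - acc[j][j] for j in range(n - 1)) / (n - 1)

    # Forward transfer: accuracy on a task just before it is trained,
    # relative to the untrained baseline.
    forward = sum(acc[j - 1][j] - baseline[j] for j in range(1, n)) / (n - 1)

    return average_accuracy, backward, forward


if __name__ == "__main__":
    # Hypothetical three-task accuracy matrix, for illustration only.
    example_acc = [
        [0.9, 0.2, 0.1],
        [0.8, 0.9, 0.3],
        [0.7, 0.8, 0.9],
    ]
    example_baseline = [0.1, 0.1, 0.1]
    print(continual_metrics(example_acc, example_baseline))
```

Under these definitions, a method that does not forget has backward transfer close to zero or positive, while positive forward transfer means that earlier tasks helped with tasks not yet trained.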
Data Availability Statement
We analysed two datasets, ESC-50 and UrbanSound8K, for the evaluation in this research article. The ESC-50 dataset was developed by Karol J. Piczak and is publicly available at [1]. The UrbanSound8K dataset was developed by J. Salamon, C. Jacoby, and J. P. Bello and is publicly available at [2]. The data supporting Fig. 4 and Tables 2, 3, and 6 are not publicly available, in order to preserve data confidentiality. [1] https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YDEPUT [2] https://urbansounddataset.weebly.com/urbansound8k.html
Funding
This study was supported by the Ministry of Science and Technology of Taiwan (MOST 111-2314-B-350-002-MY2) and Cheng Hsin General Hospital (CY11003 and CY11107).
About this article
Cite this article
Karam, S., Ruan, SJ., Haq, Q.M.u. et al. Episodic memory based continual learning without catastrophic forgetting for environmental sound classification. J Ambient Intell Human Comput 14, 4439–4449 (2023). https://doi.org/10.1007/s12652-023-04561-5
DOI: https://doi.org/10.1007/s12652-023-04561-5