Episodic memory based continual learning without catastrophic forgetting for environmental sound classification

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

Convolutional neural networks suffer from catastrophic forgetting during continual learning: when trained on a new task, they tend to lose what they learned on earlier ones. This is one of the major obstacles to building artificial intelligence that can solve new problems without forgetting previously learned information. In this article, we propose an episodic memory technique for learning sound data incrementally. The proposed method observes tasks sequentially and solves each new task without forgetting the previous ones. The results show that the proposed method transfers knowledge both backward and forward efficiently. In the performance evaluation, the proposed method outperforms the other benchmark methods. On the ESC-50 and UrbanSound8K datasets, the proposed method obtained 96.5% and 93.1% accuracy, respectively.
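The abstract does not spell out how the episodic memory is built, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a fixed number of memory slots per task filled by reservoir sampling, and it computes backward transfer as the average change in accuracy on earlier tasks after the final task has been learned (the definition used in the gradient episodic memory literature). The names `EpisodicMemory`, `slots_per_task`, `replay`, and `backward_transfer`, as well as the clip identifiers, are hypothetical.

```python
import random
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[str, int]  # (hypothetical clip identifier, class label)


class EpisodicMemory:
    """Fixed-size per-task store of past samples (illustrative assumption)."""

    def __init__(self, slots_per_task: int, seed: int = 0) -> None:
        self.slots_per_task = slots_per_task
        self.store: Dict[int, List[Sample]] = {}
        self._seen: Dict[int, int] = {}
        self._rng = random.Random(seed)

    def add(self, task_id: int, sample: Sample) -> None:
        """Reservoir sampling: every sample of a task is equally likely to be kept."""
        slots = self.store.setdefault(task_id, [])
        seen = self._seen.get(task_id, 0) + 1
        self._seen[task_id] = seen
        if len(slots) < self.slots_per_task:
            slots.append(sample)
        else:
            j = self._rng.randrange(seen)
            if j < self.slots_per_task:
                slots[j] = sample

    def replay(self, current_task: int, k: int) -> List[Sample]:
        """Draw up to k stored samples from tasks other than the current one."""
        pool = [s for t, items in self.store.items()
                if t != current_task for s in items]
        return self._rng.sample(pool, min(k, len(pool)))


def backward_transfer(acc: Sequence[Sequence[float]]) -> float:
    """acc[i][j] is the accuracy on task j after training on task i.

    Returns the mean of acc[T-1][j] - acc[j][j] over the earlier tasks j;
    negative values indicate forgetting.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        raise ValueError("backward transfer needs at least two tasks")
    last = num_tasks - 1
    return sum(acc[last][j] - acc[j][j] for j in range(last)) / last


# Fill the memory while observing two earlier tasks, then mix replayed
# samples into a batch for a new third task.
memory = EpisodicMemory(slots_per_task=4)
for task_id in range(2):
    for i in range(20):
        memory.add(task_id, (f"task{task_id}_clip{i}", task_id))

mixed_batch = [("task2_clip0", 2)] + memory.replay(current_task=2, k=3)
```

In a real training loop, each batch for the new task would be combined with replayed samples like `mixed_batch` before the gradient step; how the memory is sized and sampled in the article itself cannot be determined from the abstract alone.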

Data Availability Statement

We analysed two datasets, ESC-50 and UrbanSound8K, for the evaluation in this research article. The ESC-50 dataset was developed by Karol J. Piczak and is publicly available at [1]. The UrbanSound8K dataset was developed by J. Salamon, C. Jacoby, and J. P. Bello and is publicly available at [2]. The data supporting Fig. 4 and Tables 2, 3, and 6 are not publicly available in order to preserve data confidentiality.

[1] https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/YDEPUT
[2] https://urbansounddataset.weebly.com/urbansound8k.html

References

  • Ali Z, Talha M (2018) Innovative method for unsupervised voice activity detection and classification of audio segments. IEEE Access 6:15494–15504

  • Aljundi R, Babiloni F, Elhoseiny M, Rohrbach M, Tuytelaars T (2018) Memory aware synapses: Learning what (not) to forget. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 139–154

  • Birajdar GK, Patil MD (2020) Speech/music classification using visual and spectral chromagram features. J Ambient Intell Humaniz Comput 11(1):329–347

  • Castro FM, Marín-Jiménez MJ, Guil N, Schmid C, Alahari K (2018) End-to-end incremental learning. In: Proceedings of the European conference on computer vision (ECCV), pp 233–248

  • Chaudhry A, Ranzato M, Rohrbach M, Elhoseiny M (2018) Efficient lifelong learning with a-gem. arXiv preprint arXiv:1812.00420

  • Chen Z, Liu B (2018) Lifelong machine learning. Synth Lect Artif Intell Mach Learn 12(3):1–207

  • Cotton CV, Ellis DP (2011) Spectral vs. spectro-temporal features for acoustic event detection. In: 2011 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), IEEE, pp 69–72

  • Das JK, Ghosh A, Pal AK, Dutta S, Chakrabarty A (2020) Urban sound classification using convolutional neural network and long short term memory based on multiple features. In: 2020 fourth international conference on intelligent computing in data sciences (ICDS), IEEE, pp 1–9

  • Demir F, Turkoglu M, Aslan M, Sengur A (2020) A new pyramidal concatenated cnn approach for environmental sound classification. Appl Acoust 170:107520

  • Druzhkov P, Kustikova V (2016) A survey of deep learning methods and software tools for image classification and object detection. Pattern Recognit Image Anal 26(1):9–15

  • Green M, Murphy D (2020) Environmental sound monitoring using machine learning on mobile devices. Appl Acoust 159:107041

  • Guo Y, Liu M, Yang T, Rosing T (2020) Improved schemes for episodic memory-based lifelong learning. Adv Neural Inf Process Syst 33:1023–1035

  • ul Haq QM, Ruan SJ, Haq MA, Karam S, Shieh JL, Chondro P, Gao DQ (2021) An incremental learning of yolov3 without catastrophic forgetting for smart city applications. In: IEEE consumer electronics magazine

  • He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778

  • Intani P, Orachon T (2013) Crime warning system using image and sound processing. In: 2013 13th international conference on control, automation and systems (ICCAS 2013), IEEE, pp 1751–1753

  • Jahanjoo A, Naderan M, Rashti MJ (2020) Detection and multi-class classification of falling in elderly people by deep belief network algorithms. J Ambient Intell Humaniz Comput 11(10):4145–4165

  • Joo HR, Frank LM (2018) The hippocampal sharp wave-ripple in memory retrieval for immediate use and consolidation. Nat Rev Neurosci 19(12):744–757

  • Karam S, Ruan SJ, ul Haq QM (2022) Task incremental learning with static memory for audio classification without catastrophic interference. IEEE Consumer Electron Mag

  • Kemker R, Kanan C (2017) Fearnet: Brain-inspired model for incremental learning. arXiv preprint arXiv:1711.10563

  • Kirkpatrick J, Pascanu R, Rabinowitz N, Veness J, Desjardins G, Rusu AA, Milan K, Quan J, Ramalho T, Grabska-Barwinska A et al (2017) Overcoming catastrophic forgetting in neural networks. Proc Natl Acad Sci 114(13):3521–3526

  • Koh E, Saki F, Guo Y, Hung CY, Visser E (2020) Incremental learning algorithm for sound event detection. In: 2020 IEEE international conference on multimedia and expo (ICME), IEEE, pp 1–6

  • Krishnaveni P, Sutha J (2020) Novel deep learning framework for broadcasting abnormal events obtained from surveillance applications. J Ambient Intell Humaniz Comput 1–15

  • Li D, Tasci S, Ghosh S, Zhu J, Zhang J, Heck L (2019) Rilod: Near real-time incremental learning for object detection at the edge. In: Proceedings of the 4th ACM/IEEE symposium on edge computing, pp 113–126

  • Li H, Ishikawa S, Zhao Q, Ebana M, Yamamoto H, Huang J (2007) Robot navigation and sound based position identification. In: 2007 IEEE International Conference on Systems, Man and Cybernetics, IEEE, pp 2449–2454

  • Li Y, Li Z, Ding L, Pan Y, Huang C, Hu Y, Chen W, Gao X (2018) Supportnet: solving catastrophic forgetting in class incremental learning with support data. arXiv preprint arXiv:1806.02942

  • Li Z, Hoiem D (2017) Learning without forgetting. IEEE Trans Pattern Anal Mach Intell 40(12):2935–2947

  • Lopez-Paz D, Ranzato M (2017) Gradient episodic memory for continual learning. Adv Neural Inf Process Syst 30

  • McCloskey M, Cohen NJ (1989) Catastrophic interference in connectionist networks: the sequential learning problem. In: Psychology of learning and motivation, vol 24, Elsevier, pp 109–165

  • Messner E, Fediuk M, Swatek P, Scheidl S, Smolle-Jüttner FM, Olschewski H, Pernkopf F (2020) Multi-channel lung sound classification with convolutional recurrent neural networks. Comput Biol Med 122:103831

  • Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z, Gimelshein N, Antiga L et al. (2019) Pytorch: an imperative style, high-performance deep learning library. Adv Neural Inf Process Syst 32

  • Perez L, Wang J (2017) The effectiveness of data augmentation in image classification using deep learning. arXiv preprint arXiv:1712.04621

  • Piczak KJ (2015) Esc: Dataset for environmental sound classification. In: Proceedings of the 23rd ACM international conference on Multimedia, pp 1015–1018

  • Rebuffi SA, Kolesnikov A, Sperl G, Lampert CH (2017) icarl: Incremental classifier and representation learning. In: Proceedings of the IEEE conference on Computer Vision and Pattern Recognition, pp 2001–2010

  • Salamon J, Bello JP (2017) Deep convolutional neural networks and data augmentation for environmental sound classification. IEEE Signal Process Lett 24(3):279–283

  • Salamon J, Jacoby C, Bello JP (2014) A dataset and taxonomy for urban sound research. In: Proceedings of the 22nd ACM international conference on Multimedia, pp 1041–1044

  • Sharma S, Gupta R, Kumar A (2021) Continuous sign language recognition using isolated signs data and deep transfer learning. J Ambient Intell Humaniz Comput 1–12

  • Shen Y, Cao J, Wang J, Yang Z (2020) Urban acoustic classification based on deep feature transfer learning. J Franklin Inst 357(1):667–686

  • Shieh JL, Haq QMu, Haq MA, Karam S, Chondro P, Gao DQ, Ruan SJ (2020) Continual learning strategy in one-stage object detection framework based on experience replay for autonomous driving vehicle. Sensors 20(23):6777

  • Shin H, Lee JK, Kim J, Kim J (2017) Continual learning with deep generative replay. Adv Neural Inf Process Syst 30

  • Shmelkov K, Schmid C, Alahari K (2017) Incremental learning of object detectors without catastrophic forgetting. In: Proceedings of the IEEE international conference on computer vision, pp 3400–3409

  • Thangavel S, Shokkalingam CS (2022) The iot based embedded system for the detection and discrimination of animals to avoid human-wildlife conflict. J Ambient Intell Humaniz Comput 13(6):3065–3081

  • Wang Z, Subakan C, Tzinis E, Smaragdis P, Charlin L (2019) Continual learning of new sound classes using generative replay. In: 2019 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), IEEE, pp 308–312

  • Wu X, Sahoo D, Hoi SC (2020) Recent advances in deep learning for object detection. Neurocomputing 396:39–64

  • Wu Y, Chen Y, Wang L, Ye Y, Liu Z, Guo Y, Fu Y (2019) Large scale incremental learning. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 374–382

  • Xin X, Wenhui F, Yuming Y, Bin G, Junhai C, Wei W (2007) Hla based high level modeling and simulation for integrated logistical supporting system. In: 2007 IEEE international conference on automation and logistics, IEEE, pp 2041–2045

  • Zhang J, Zhang J, Ghosh S, Li D, Tasci S, Heck L, Zhang H, Kuo CCJ (2020) Class-incremental learning via deep model consolidation. In: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp 1131–1140

  • Zhu B, Wang C, Liu F, Lei J, Huang Z, Peng Y, Li F (2018) Learning environmental sounds with multi-scale convolutional neural network. In: 2018 international joint conference on neural networks (IJCNN), IEEE, pp 1–8

Funding

This study was supported by the Ministry of Science and Technology of Taiwan (MOST 111-2314-B-350-002-MY2) and Cheng Hsin General Hospital (CY11003 and CY11107).

Author information

Corresponding author

Correspondence to Lieber Po-Hung Li.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Karam, S., Ruan, SJ., Haq, Q.M.u. et al. Episodic memory based continual learning without catastrophic forgetting for environmental sound classification. J Ambient Intell Human Comput 14, 4439–4449 (2023). https://doi.org/10.1007/s12652-023-04561-5

  • DOI: https://doi.org/10.1007/s12652-023-04561-5
