
A Survey on Deep Learning Based Forest Environment Sound Classification at the Edge

Published: 05 October 2023

Abstract

Forest ecosystems are of paramount importance to the sustainable existence of life on Earth. Natural and man-made phenomena pose severe threats to the preservation of such ecosystems. With the advancement of artificial intelligence technologies, acoustic surveillance has been established as a practical and effective basis for forest monitoring systems. With the support of transfer learning, deep learning algorithms have been shown to outperform conventional machine learning algorithms for forest acoustic classification. Furthermore, the research community has identified a clear need to move conventional cloud-based sound classification to the edge in order to ensure real-time identification of acoustic incidents. This article presents a comprehensive survey of state-of-the-art forest sound classification approaches, publicly available forest acoustic datasets, and the associated infrastructure. We also discuss the open challenges and future research directions that govern forest acoustic classification.
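Most of the classification pipelines surveyed here begin by converting raw audio into a time-frequency representation (typically a spectrogram or mel-spectrogram) that a deep network then classifies. As a minimal, self-contained illustration of that feature-extraction step (the function name, frame length, and hop size below are illustrative choices, not drawn from any particular surveyed system), the following sketch computes a log-magnitude spectrogram using NumPy alone:

```python
import numpy as np

def log_spectrogram(signal, frame_len=512, hop=256):
    """Frame the signal, apply a Hann window, and take the
    log-magnitude of the real FFT of each frame."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, frame_len//2 + 1)
    return np.log(mag + 1e-10)                 # log compression stabilizes dynamic range

# A 1-second synthetic 440 Hz tone at 16 kHz, standing in for a forest recording.
sr = 16000
t = np.arange(sr) / sr
spec = log_spectrogram(np.sin(2 * np.pi * 440 * t))
print(spec.shape)  # (61, 257)
```

In a real edge deployment, the resulting 2D array would be fed to a (possibly pretrained and fine-tuned) convolutional network, which is where the transfer-learning advantage discussed above applies.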

References

[1]
Olusola O. Abayomi-Alli, Robertas Damaševičius, Atika Qazi, Mariam Adedoyin-Olowe, and Sanjay Misra. 2022. Data augmentation and deep learning methods in sound classification: A systematic review. Electronics 11, 22 (2022), 3795.
[2]
Sajjad Abdoli, Patrick Cardinal, and Alessandro Lameiras Koerich. 2019. End-to-end environmental sound classification using a 1D convolutional neural network. Expert Systems with Applications 136 (2019), 252–263.
[3]
Jakob Abeßer. 2020. A review of deep learning based methods for acoustic scene classification. Applied Sciences 10, 6 (2020), 1–16.
[4]
Sainath Adapa. 2019. Urban sound tagging using convolutional neural networks. In Proceedings of the Detection and Classification of Acoustic Scenes and Events Workshop (DCASE’19). 5–9.
[5]
Shalbbya Ali, Safdar Tanweer, Syed Sibtain Khalid, and Naseem Rao. 2021. Mel frequency cepstral coefficient: A review. In Proceedings of the 2nd International Conference on ICT for Digital, Smart, and Sustainable Development (ICIDSSD’21). 92–101.
[6]
Alessandro Andreadis, Giovanni Giambene, and Riccardo Zambon. 2021. Monitoring illegal tree cutting through ultra-low-power smart IoT devices. Sensors 21, 22 (2021), 7593.
[7]
Paolo Annesi, Roberto Basili, Raffaele Gitto, Alessandro Moschitti, and Riccardo Petitti. 2007. Audio feature engineering for automatic music genre classification. In Proceedings of Large Scale Semantic Access to Content (Text, Image, Video, and Sound) (RIAO’07). 702–711.
[8]
AudioSet. 2021. AudioSet—A Large-Scale Dataset of Manually Annotated Audio Events. Retrieved July 20, 2022 from https://research.google.com/audioset/download.html
[9]
Miroslav Babiš, Maroš Ďuríček, Valéria Harvanová, and Martin Vojtko. 2011. Forest Guardian—Monitoring system for detecting logging activities based on sound recognition. Researching Solutions in Artificial Intelligence, Computer Graphics and Multimedia, IIT.SRC 2011 (2011), 1–6.
[10]
Meelan Bandara, Roshinie Jayasundara, Isuru Ariyarathne, Dulani Meedeniya, and Charith Perera. 2023. Forest sound classification dataset: FSC22. Sensors 23 (2023), 1–22.
[11]
Anam Bansal and Naresh Kumar Garg. 2022. Environmental sound classification: A descriptive review of the literature. Intelligent Systems with Applications 16 (2022), 200115.
[12]
Bipendra Basnyat, Nirmalya Roy, Aryya Gangopadhyay, and Adrienne Raglin. 2022. Environmental sound classification for flood event detection. In Proceedings of the 18th International Conference on Intelligent Environments. IEEE, Los Alamitos, CA, 1–8.
[13]
BBC. 2020. BBC Sound Effects. Retrieved July 20, 2022 from https://sound-effects.bbcrewind.co.uk
[14]
Carol Bedoya, Claudia Isaza, Juan M. Daza, and José D. López. 2017. Automatic identification of rainfall in acoustic recordings. Ecological Indicators 75 (2017), 95–100.
[15]
K. Manasvi Bhat, Manan Bhandari, ChangSeok Oh, Sujin Kim, and Jeeho Yoo. 2020. Transfer learning based automatic model creation tool for resource constraint devices. arXiv abs/2012.10056 (2020).
[16]
Mark Cartwright, Jason Cramer, Ana Elisa Méndez Méndez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, and Juan Pablo Bello. 2020. SONYC-UST-V2: An urban sound tagging dataset with spatiotemporal context. In Proceedings of the Detection and Classification of Acoustic Scenes and Events Workshop (DCASE’20). 1–5.
[17]
Mark Cartwright, Jason Cramer, Ana Elisa Mendez Mendez, Yu Wang, Ho-Hsiang Wu, Vincent Lostanlen, Magdalena Fuentes, Graham Dove, Charlie Mydlarz, Justin Salamon, Oded Nov, and Juan Pablo Bello. 2020. SONYC Urban Sound Tagging (SONYC-UST): A Multilabel Dataset from an Urban Acoustic Sensor Network. Retrieved July 20, 2022 from
[18]
Mark Cartwright, Jason Cramer, Justin Salamon, and Juan Pablo Bello. 2019. Tricycle: Audio representation learning from sensor network data using self-supervision. In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA’19). IEEE, Los Alamitos, CA, 278–282.
[19]
M. Cartwright, A. Mendez, J. Cramer, V. Lostanlen, G. Dove, H. Wu, J. Salamon, O. Nov, and J. Bello. 2019. SONYC Urban Sound Tagging (SONYC-UST): A multilabel dataset from an urban acoustic sensor network. In Proceedings of the Detection and Classification of Acoustic Scenes and Events Workshop. 35–39.
[20]
Gianmarco Cerutti, Renzo Andri, Lukas Cavigelli, Elisabetta Farella, Michele Magno, and Luca Benini. 2020. Sound event detection with binary neural networks on tightly power-constrained IoT devices. In Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design. ACM, New York, NY, 19–24.
[21]
Gianmarco Cerutti, Rahul Prasad, Alessio Brutti, and Elisabetta Farella. 2020. Compact recurrent neural networks for acoustic event detection on low-energy low-complexity platforms. IEEE Journal of Selected Topics in Signal Processing 14, 4 (2020), 654–664.
[22]
C. Chalmers, P. Fergus, S. Wich, and S. N. Longmore. 2021. Modelling animal biodiversity using acoustic monitoring and deep learning. In Proceedings of the 2021 International Joint Conference on Neural Networks (IJCNN’21). IEEE, Los Alamitos, CA, 1–7.
[23]
Chuan-Yu Chang and Yi-Ping Chang. 2013. Application of abnormal sound recognition system for indoor environment. In Proceedings of the 9th International Conference on Information, Communications, and Signal Processing. IEEE, Los Alamitos, CA, 1–5.
[24]
Nitin Kumar Chauhan and Krishna Singh. 2018. A review on conventional machine learning vs deep learning. In Proceedings of the International Conference on Computing, Power, and Communication Technologies. IEEE, Los Alamitos, CA, 347–352.
[25]
Chai Chen, Yuxuan Liu, Haoran Sun, and Moyan Zhou. 2019. Audio Feature Extraction and Classification for Urban Sound. Retrieved September 11, 2023 from https://github.com/yuxuan3713/ECE-228-Project
[26]
Jasmine Chhikara. 2021. Transfer learning models based environment audio classification. International Journal of Emerging Technologies in Engineering Research 9 (2021), 1–8.
[27]
Selina Chu, Shrikanth Narayanan, and C.-C. Jay Kuo. 2009. Environmental sound recognition with time–frequency audio features. IEEE Transactions on Audio, Speech, and Language Processing 17, 6 (2009), 1142–1158.
[28]
Junyoung Chung, Çaglar Gülçehre, Kyunghyun Cho, and Yoshua Bengio. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. In Proceedings of the NIPS Workshop on Deep Learning. 1–9.
[29]
Abigail Copiaco, Christian Ritz, Nidhal Abdulaziz, and Stefano Fasciani. 2021. A study of features and deep neural network architectures and hyper-parameters for domestic audio classification. Applied Sciences 11, 11 (2021), 4880.
[30]
Thi Thuy An Dang and Thi Kieu Tran. 2016. Audio scene classification using gated recurrent neural network. In Proceedings of the Conference on Information Technology and Its Applications (CITA’16). IEEE, Los Alamitos, CA, 48–51.
[31]
Joy Krishan Das, Ghosh Arka, Pal Abhijit Kumar, Dutta Sumit, and Chakrabarty Amitabha. 2020. Urban sound classification using convolutional neural network and long short term memory based on multiple features. In Proceedings of the 2020 4th International Conference on Intelligent Computing in Data Sciences (ICDS’20). IEEE, Los Alamitos, CA, 1–9.
[32]
Aveen Dayal, Sreenivasa Reddy Yeduri, Balu Harshavardan Koduru, Rahul Kumar Jaiswal, J. Soumya, M. B. Srinivas, Om Jee Pandey, and Linga Reddy Cenkeramaddi. 2022. Lightweight deep convolutional neural network for background sound classification in speech signals. Journal of the Acoustical Society of America 151, 4 (2022), 2773–2786.
[33]
Jonathan Dennis, Huy Dat Tran, and Haizhou Li. 2011. Spectrogram image feature for sound event classification in mismatched conditions. IEEE Signal Processing Letters 18, 2 (2011), 130–133.
[34]
Itxasne Diez Gaspon, Ibon Saratxaga, and Karmele Lopez de Ipiña. 2019. Deep learning for natural sound classification. In INTER-NOISE and NOISE-CON Congress and Conference Proceedings, Vol. 259. Institute of Noise Control Engineering, Madrid, Spain, 5683–5692.
[35]
Golnoosh Elhami and Romann M. Weber. 2019. Audio feature extraction with convolutional neural autoencoders with application to voice conversion. In Infoscience EPFL Scientific Publications. EPFL Scientific Publications, Lausanne, Switzerland, 1–5. http://infoscience.epfl.ch/record/261268
[36]
David Elliott, Evan Martino, Carlos E. Otero, Anthony Smith, Adrian M. Peter, Benjamin Luchterhand, Eric Lam, and Steven Leung. 2020. Cyber-physical analytics: Environmental sound classification at the edge. In Proceedings of the 2020 IEEE 6th World Forum on Internet of Things (WF-IoT’20). IEEE, Los Alamitos, CA, 1–6.
[37]
David Elliott, Carlos E. Otero, Steven Wyatt, and Evan Martino. 2021. Tiny transformers for environmental sound classification at the edge. arXiv abs/2103.12157 (2021).
[38]
Marcelo Fernandes, Weverton Cordeiro, and Mariana Recamonde-Mendoza. 2021. Detecting Aedes aegypti mosquitoes through audio classification with convolutional neural networks. Computers in Biology & Medicine 129 (2021), 104152.
[39]
Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, and Xavier Serra. 2020. FSD50K: An Open Dataset of Human-Labeled Sound Events. Retrieved July 20, 2022 from
[40]
Eduardo Fonseca, Xavier Favory, Jordi Pons, Frederic Font, and Xavier Serra. 2022. FSD50K: An open dataset of human-labeled sound events. IEEE/ACM Transactions on Audio, Speech and Language Processing 30, 24 (2022), 829–852.
[41]
Eduardo Fonseca, Jordi Pons, Xavier Favory, Frederic Font, Dmitry Bogdanov, Andrés Ferraro, Sergio Oramas, Alastair Porter, and Xavier Serra. 2017. Freesound datasets: A platform for the creation of open audio datasets. In Proceedings of the 18th International Society for Music Information Retrieval Conference (ISMIR’17). 486–493.
[42]
Frederic Font, Gerard Roma, and Xavier Serra. 2013. Freesound technical demo. In Proceedings of the 21st ACM International Conference on Multimedia. ACM, New York, NY, 411–412.
[43]
Avi Gazneli, Gadi Zimerman, T. Ridnik, Gilad Sharir, and Asaf Noy. 2022. End-to-end audio strikes back: Boosting augmentations towards an efficient audio classification network. arXiv abs/2204.11479 (2022).
[44]
Jort F. Gemmeke, Daniel P. W. Ellis, Dylan Freedman, Aren Jansen, Wade Lawrence, R. Channing Moore, Manoj Plakal, and Marvin Ritter. 2017. Audio Set: An ontology and human-labeled dataset for audio events. In Proceedings of the 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’17). 776–780.
[45]
Marius Vasile Ghiurcau, Corneliu Rusu, Radu Ciprian Bilcu, and Jaakko Astola. 2012. Audio based solutions for detecting intruders in wild areas. Signal Processing 92, 3 (2012), 829–840.
[46]
N’tcho Assoukpou Jean Gnamélé, Yelakan Berenger Ouattara, Tokpa Arsene Kobea, Geneviève Baudoin, and Jean-Marc Laheurte. 2019. KNN and SVM classification for chainsaw identification in the forest areas. International Journal of Advanced Computer Science and Applications 10, 12 (2019), 531–536.
[47]
Alex Graves. 2012. Long short-term memory. In Supervised Sequence Labelling with Recurrent Neural Networks. Studies in Computational Intelligence, Vol. 385. Springer, 37–45.
[48]
Andrey Guzhov, Federico Raue, Jörn Hees, and Andreas Dengel. 2021. ESResNet: Environmental sound classification based on visual domain models. In Proceedings of the 25th International Conference on Pattern Recognition (ICPR’21). IEEE, Los Alamitos, CA, 4933–4940.
[49]
Osman Günay, Kasım Taşdemir, B. Uğur Töreyin, and A. Enis Çetin. 2009. Video based wildfire detection at night. Fire Safety Journal 44, 6 (2009), 860–868.
[50]
Shawn Hershey, Sourish Chaudhuri, Daniel P. W. Ellis, Jort F. Gemmeke, Aren Jansen, R. Channing Moore, Manoj Plakal, Devin Platt, Rif A. Saurous, Bryan Seybold, Malcolm Slaney, Ron J. Weiss, and Kevin Wilson. 2017. CNN architectures for large-scale audio classification. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’17). IEEE, Los Alamitos, CA, 131–135.
[51]
Jonas Höchst, Hicham Bellafkir, Patrick Lampe, Markus Vogelbacher, Markus Mühling, Daniel Schneider, Kim Lindner, Sascha Rösner, Dana G. Schabo, Nina Farwig, and Bernd Freisleben. 2022. Bird@Edge: Bird species recognition at the edge. In Proceedings of the 10th International Conference on Networked Systems (NETYS’22). 69–86.
[52]
Zilong Huang, Chen Liu, Hongbo Fei, Wei Li, Jinghu Yu, and Yi Cao. 2020. Urban sound classification based on 2-order dense convolutional network using dual features. Applied Acoustics 164 (2020), 107243.
[53]
Muhammad Huzaifah. 2017. Comparison of time-frequency representations for environmental sound classification using convolutional neural networks. arXiv abs/1706.07156 (2017).
[54]
Rakib Hyder, Shabnam Ghaffarzadegan, Zhe Feng, John Hansen, and Taufiq Hasan. 2017. Acoustic scene classification using a CNN-supervector system trained with auditory and spectrogram image features. In Proceedings of Interspeech 2017. 3073–3077.
[55]
Christian Janiesch, Patrick Zschech, and Kai Heinrich. 2021. Machine learning and deep learning. Electron Markets 31, 31 (2021), 685–695.
[56]
Gayan Kalhara, Vishan Jayasinghearachchi, Achala Dias, Vishwa Ratnayake, Chandimal Jayawardena, and Nuwan Kuruwitaarachchi. 2017. TreeSpirit: Illegal logging detection and alerting system using audio identification over an IoT network. In Proceedings of the 2017 11th International Conference on Software, Knowledge, Information Management, and Applications (SKIMA’17). IEEE, Los Alamitos, CA, 1–7.
[57]
Kalyanaswamy Banuroopa and Shanmuga Priyaa. 2022. MFCC based hybrid fingerprinting method for audio classification through LSTM. International Journal of Nonlinear Analysis and Applications 12 (2022), 2125–2136.
[58]
Jaehun Kim, Kyoungin Noh, Jaeha Kim, and Joon-Hyuk Chang. 2018. Sound event detection based on beamformed convolutional neural network using multi-microphones. In Proceedings of the 2018 International Conference on Network Infrastructure and Digital Content (IC-NIDC’18). IEEE, Los Alamitos, CA, 170–173.
[59]
Tomoya Koike, Kun Qian, Qiuqiang Kong, Mark D. Plumbley, Björn W. Schuller, and Yoshiharu Yamamoto. 2020. Audio for audio is better? An investigation on transfer learning models for heart sound classification. In Proceedings of the 42nd Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE, Los Alamitos, CA, 74–77.
[60]
Stanislaw Komorowski Dariusz, Pietraszek. 2015. The use of continuous wavelet transform based on the fast Fourier transform in the analysis of multi-channel electrogastrography recordings. Journal of Medical Systems 40 (2015), 10.
[61]
Qiuqiang Kong, Turab Iqbal, Yong Xu, Wenwu Wang, and Mark D. Plumbley. 2018. DCASE challenge surrey cross-task convolutional neural network baseline. In Detection and Classification of Acoustic Scenes and Events. DCASE.
[62]
Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. 2017. ImageNet classification with deep convolutional neural networks. Communications of the ACM 60, 6 (2017), 84–90.
[63]
Anurag Kumar and Vamsi Krishna Ithapu. 2020. A sequential self teaching approach for improving generalization in sound event recognition. In Proceedings of the 37th International Conference on Machine Learning. 5447–5457.
[64]
Ianis Lallemand, Diemo Schwarz, and Thierry Artières. 2012. Content-based retrieval of environmental sounds by multiresolution analysis. In Proceedings of Content-Based Retrieval of Environmental Sounds by Multiresolution Analysis (SMC’12).1.
[65]
Mario Lasseck. 2018. Audio-based bird species identification with deep convolutional neural networks. In Proceedings of the Working Notes of CLEF. 2125.
[66]
Iurii Lezhenin, Natalia Bogach, and Evgeny Pyshkin. 2019. Urban sound classification using long short-term memory neural network. In Proceedings of the 14th Federated Conference on Computer Science and Information Systems. IEEE, Los Alamitos, CA, 57–60.
[67]
Juncheng Billy Li, Shuhui Qu, Po-Yao Huang, and Florian Metze. 2022. AudioTagging done right: 2nd comparison of deep learning methods for environmental sound classification. arXiv abs/2203.13448 (2022).
[68]
Ying Li and Zhibin Wu. 2015. Animal sound recognition based on double feature of spectrogram in real environment. In Proceedings of the 2015 International Conference on Wireless Communications and Signal Processing (WCSP’15). IEEE, Los Alamitos, CA, 1–5.
[69]
Aswathy Madhu and Suresh Kirthi Kumaraswamy. 2021. EnvGAN: Adversarial synthesis of environmental sounds for data augmentation. arXiv abs/2104.07326 (2021).
[70]
Alina-Elena Marcu, George Suciu, Elena Olteanu, Delia Miu, Alexandru Drosu, and Ioana Marcu. 2019. IoT system for forest monitoring. In Proceedings of the 42nd International Conference on Telecommunications and Signal Processing (TSP’19). IEEE, Los Alamitos, CA, 629–632.
[71]
Lucas Martin Wisniewski, Jean-Michel Bec, Guillaume Boguszewski, and Abdoulaye Gamatié. 2022. Hardware solutions for low-power smart edge computing. Journal of Low Power Electronics and Applications 12, 4 (2022), 61.
[72]
Brian McFee, Colin Raffel, Dawen Liang, Daniel Ellis, Matt McVicar, Eric Battenberg, and Oriol Nieto. 2015. librosa: Audio and music signal analysis in Python. In Proceedings of the 14th Python in Science Conference. 18–24.
[73]
Dulani Meedeniya. 2023. Deep Learning: A Beginners’ Guide. CRC Press, Boca Raton, FL. https://books.google.lk/books?id=PiirzwEACAAJ
[74]
Aska Mehyadin, Adnan Mohsin Abdulazeez, Dathar Abas Hasan, and Jwan Saeed. 2021. Birds sound classification based on machine learning algorithms. Asian Journal of Research in Computer Science 9 (2021), 1–11.
[75]
Massimo Merenda, Carlo Porcaro, and Demetrio Iero. 2020. Edge machine learning for AI-enabled IoT devices: A review. Sensors 20, 9 (2020), 2533.
[76]
B. Mishachandar and S. Vairamuthu. 2021. Diverse ocean noise classification using deep learning. Applied Acoustics 181 (2021), 108141.
[77]
Md. Mohaimenuzzaman, Christoph Bergmeir, and Bernd Meyer. 2022. Pruning vs XNOR-Net: A comprehensive study of deep learning for audio classification on edge-devices. IEEE Access 10 (2022), 6696–6707.
[78]
Md. Mohaimenuzzaman, Christoph Bergmeir, Ian West, and Bernd Meyer. 2023. Environmental sound classification on the edge: A pipeline for deep acoustic networks on extremely resource-constrained devices. Pattern Recognition 133 (2023), 109025.
[79]
Iosif Mporas, Isidoros Perikos, Vasilios Kelefouras, and Michael Paraskevas. 2020. Illegal logging detection based on acoustic surveillance of forest. Applied Sciences 10, 20 (2020), 1–12.
[80]
Seongkyu Mun, Sangwook Park, David K. Han, and Hanseok Ko. 2017. Generative adversarial network based acoustic scene training set augmentation and selection using SVM hyperplane. In Proceedings of the Detection and Classification of Acoustic Scenes and Events Workshop (DCASE’17). 93–102.
[81]
Andrés Muñoz. 2014. Machine learning and optimization. Courant Institute of Mathematical Sciences 2014 (2012), 1–2.
[82]
Zohaib Mushtaq and Shun-Feng Su. 2020. Efficient classification of environmental sounds through multiple features aggregation and data enhancement techniques for spectrogram images. Symmetry 12, 11 (2020), 1822.
[83]
Zohaib Mushtaq, Shun-Feng Su, and Quoc-Viet Tran. 2021. Spectral images based environmental sound classification using CNN with meaningful data augmentation. Applied Acoustics 172 (2021), 107581.
[84]
Satoshi Nakamura, Kazuo Hiyane, Futoshi Asano, and Takashi Endo. 1999. Sound scene data collection in real acoustical environments. Journal of the Acoustical Society of Japan (E) 20, 3 (1999), 225–231.
[85]
Loris Nanni, Gianluca Maguolo, Sheryl Brahnam, and Michelangelo Paci. 2021. An ensemble of convolutional neural networks for audio classification. Applied Sciences 11, 13 (2021), 5796.
[86]
Loris Nanni, Gianluca Maguolo, and Michelangelo Paci. 2020. Data augmentation approaches for improving animal audio classification. arXiv abs/1912.07756 (2020).
[87]
Alireza Nasiri and Jianjun Hu. 2021. SoundCLR: Contrastive learning of representations for improved environmental sound classification. arXiv abs/2103.01929 (2021).
[88]
Masaki Okawa, Takuya Saito, Naoki Sawada, and Hiromitsu Nishizaki. 2019. Audio classification of bit-representation waveform. In Proceedings of Interspeech 2019. 2553–2557.
[89]
Elena Olteanu, Victor Suciu, Svetlana Segarceanu, Ioana Petre, and Andrei Scheianu. 2018. Forest monitoring system through sound recognition. In Proceedings of the International Conference on Communications (COMM’18). IEEE, Los Alamitos, CA, 75–80.
[90]
Heshan Padmasiri, Jithmi Shashirangana, Dulani Meedeniya, Omer Rana, and Charith Perera. 2022. Automated license plate recognition for resource-constrained environments. Sensors 22, 4 (2022), 1434.
[91]
Kamalesh Palanisamy, Dipika Singhania, and Angela Yao. 2020. Rethinking CNN models for audio classification. arXiv abs/2007.11154 (2020).
[92]
Sinno Jialin Pan and Qiang Yang. 2010. A survey on transfer learning. IEEE Transactions on Knowledge and Data Engineering 22, 10 (2010), 1345–1359.
[93]
Yagya Raj Pandeya, Dongwhoon Kim, and Joonwhoan Lee. 2018. Domestic cat sound classification using learned features from deep neural nets. Applied Sciences 8, 10 (2018), 1949.
[94]
Ning Peng, Aibin Chen, Guoxiong Zhou, Wenjie Chen, Wenzhuo Zhang, Jing Liu, and Fubo Ding. 2020. Environment sound classification based on visual multi-feature fusion and GRU-AWS. IEEE Access 8 (2020), 191100–191114.
[95]
Karol J. Piczak. 2015. Environmental sound classification with convolutional neural networks. In Proceedings of the 2015 IEEE 25th International Workshop on Machine Learning for Signal Processing (MLSP’15). IEEE, Los Alamitos, CA, 1–6.
[96]
Karol J. Piczak. 2015. ESC-50: Dataset for Environmental Sound Classification. Retrieved July 20, 2022 from https://github.com/karolpiczak/ESC-50
[97]
Karol J. Piczak. 2015. ESC: Dataset for environmental sound classification. In Proceedings of the 23rd ACM International Conference on Multimedia. ACM, New York, NY, 1015–1018.
[98]
Anupriya Prasad and Pradeep Chawda. 2018. Power management factors and techniques for IoT design devices. In Proceedings of the 2018 19th International Symposium on Quality Electronic Design (ISQED’18). IEEE, Los Alamitos, CA, 364–369.
[99]
Dirga Chandra Prasetyo, Giva Andriana Mutiara, and Rini Handayani. 2018. Chainsaw sound and vibration detector system for illegal logging. In Proceedings of the 2018 International Conference on Control, Electronics, Renewable Energy, and Communications (ICCEREC’18). IEEE, Los Alamitos, CA, 93–98.
[100]
Hendrik Purwins, Bo Li, Tuomas Virtanen, Jan Schlüter, Shuo-Yiin Chang, and Tara Sainath. 2019. Deep learning for audio signal processing. IEEE Journal of Selected Topics in Signal Processing 13, 2 (2019), 206–219.
[101]
Mohammad Rastegari, Vincente Ordonez, Joseph Redmon, and Ali Farhadi. 2016. NOR-Net: ImageNet classification using binary convolutional neural networks. In Proceedings of the 14th European Conference on Computer Vision (ECCV’16). 525–542.
[102]
Imran Mohammed Safwat, Rahman Afia Fahmida, Sifat Tanvi, Kadir Hamim Hassan, Iqbal Junaid, and Mostakim Moin. 2021. An analysis of audio classification techniques using deep learning architectures. In Proceedings of the 2021 6th International Conference on Inventive Computation Technologies (ICICT’21). IEEE, Los Alamitos, CA, 805–812.
[103]
Justin Salamon, Christopher Jacoby, and Juan Pablo Bello. 2014. A dataset and taxonomy for urban sound research. In Proceedings of the 22nd ACM International Conference on Multimedia. ACM, New York, NY, 1041–1044.
[104]
Justin Salamon, Christopher Jacoby, and Juan Pablo Bello. 2014. UrbanSound8k. Retrieved July 20, 2022 from https://urbansounddataset.weebly.com/urbansound8k.html
[105]
Justin Salamon, Duncan MacConnell, Mark Cartwright, Peter Li, and Juan Pablo Bello. 2017. Scaper: A library for soundscape synthesis and augmentation. In Proceedings of the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA’17). IEEE, Los Alamitos, CA, 344–348.
[106]
Sina Sanaei, Babak Majidi, and Ehsan Akhtarkavan. 2018. Deep multisensor dashboard for composition layer of Web of Things in the smart city. In Proceedings of the 9th International Symposium on Telecommunications. IEEE, Los Alamitos, CA, 211–215.
[107]
Svetlana Segarceanu, Elena Olteanu, and George Suciu. 2020. Forest monitoring using forest sound identification. In Proceedings of the 2020 43rd International Conference on Telecommunications and Signal Processing (TSP’20). IEEE, Los Alamitos, CA, 346–349.
[108]
Svetlana Segarceanu, George Suciu, and Inge Gavat. 2021. Neural networks for automatic environmental sound recognition. In Proceedings of the 2021 International Conference on Speech Technology and Human-Computer Dialogue (SpeD’21). IEEE, Los Alamitos, CA, 7–12.
[109]
Romain Serizel, Nicolas Turpault, Ankit Shah, and Justin Salamon. 2020. Sound event detection in synthetic domestic environments. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’20). IEEE, Los Alamitos, CA, 86–90.
[110]
Sulis Setiowati, Zulfanahri, Eka Legya Franita, and Igi Ardiyanto. 2017. A review of optimization method in face recognition: Comparison deep learning and non-deep learning methods. In Proceedings of the 9th International Conference on Information Technology and Electrical Engineering (ICITEE’17). IEEE, Los Alamitos, CA, 1–6.
[111]
Sayed Khushal Shah, Zeenat Tariq, and Yugyung Lee. 2019. IoT based urban noise monitoring in deep learning using historical reports. In Proceedings of the IEEE International Conference on Big Data. IEEE, Los Alamitos, CA, 4179–4184.
[112]
Roneel V. Sharan and Tom J. Moir. 2015. Noise robust audio surveillance using reduced spectrogram image feature and one-against-all SVM. Neurocomputing 158 (2015), 90–99.
[113]
Garima Sharma, Kartikeyan Umapathy, and Sridhar Krishnan. 2020. Trends in audio signal feature extraction methods. Applied Acoustics 158 (2020), 107020.
[114]
Jithmi Shashirangana, Heshan Padmasiri, Dulani Meedeniya, Charith Perera, Soumya R. Nayak, Janmenjoy Nayak, Shanmuganthan Vimal, and Seifidine Kadry. 2021. License plate recognition using neural architecture search for edge devices. International Journal of Intelligent Systems 36, 7 (2021), 1–38.
[115]
Sungho Shin, Jongwon Kim, Yeonguk Yu, Seongju Lee, and Kyoobin Lee. 2021. Self-supervised transfer learning from natural images for sound classification. Applied Sciences 11, 7 (2021), 3043.
[116]
Siddharth Sigtia, Adam M. Stark, Sacha Krstulovic, and Mark D. Plumbley. 2016. Automatic environmental sound recognition: Performance versus computational cost. IEEE/ACM Transactions on Audio, Speech, and Language Processing 24, 11 (2016), 2096–2107.
[117]
Rajesh Singh, Anita Gehlot, Shaik Vaseem Akram, Amit Kumar Thakur, Dharam Buddhi, and Prabin Kumar Das. 2021. Forest 4.0: Digitalization of forest using the Internet of Things (IoT). Journal of King Saud University—Computer and Information Sciences 34, 8 (2021), 5587–5601.
[118]
Marina Sokolova, Nathalie Japkowicz, Stan Szpakowicz, Abdul Sattar, and Byeong-Ho Kang. 2006. Beyond accuracy, F-score and ROC: A family of discriminant measures for performance evaluation. In AI: Advances in Artificial Intelligence. Springer, Berlin, Germany, 1015–1021.
[119]
MathWorks Inc. 2019. Continuous Wavelet Transform and Scale-Based Analysis. Retrieved July 20, 2022 from https://www.mathworks.com/help/wavelet/gs/continuous-wavelet-transform-and-scale-based-analysis.html
[120]
Po-Jung Ting, Shanq-Jang Ruan, and Lieber Po-Hung Li. 2021. Environmental noise classification with inception-dense blocks for hearing aids. Sensors 21, 16 (2021), 5406.
[121]
Achyut Mani Tripathi and Aakansha Mishra. 2021. Self-supervised learning for Environmental Sound Classification. Applied Acoustics 182 (2021), 108183.
[122]
Nicolas Turpault, Romain Serizel, Scott Wisdom, Hakan Erdogan, John R. Hershey, Eduardo Fonseca, Prem Seetharaman, and Justin Salamon. 2021. Sound event detection and separation: A benchmark on DESED synthetic soundscapes. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE, Los Alamitos, CA, 840–844.
[123]
Jurgen Vandendriessche, Nick Wouters, Bruno da Silva, Mimoun Lamrini, Mohamed Yassin Chkouri, and Abdellah Touhafi. 2021. Environmental sound recognition on embedded systems: From FPGAs to TPUs. Electronics 10, 21 (2021), 2622.
[124]
Mário P. Véstias, Rui Policarpo Duarte, José T. de Sousa, and Horácio C. Neto. 2020. Moving deep learning to the edge. Algorithms 13, 5 (2020), 125.
[125]
H. L. Wang, D. Z. Song, Z. L. Li, X. Q. He, S. R. Lan, and H. F. Guo. 2020. Acoustic emission characteristics of coal failure using automatic speech recognition methodology analysis. International Journal of Rock Mechanics and Mining Sciences 136 (2020), 104472.
[126]
Zhi Wang, Wentao Zha, Jin Chai, Yilin Liu, and Zhuoling Xiao. 2021. Lightweight implementation of FPGA-based environmental sound recognition system. In Proceedings of the International Conference on UK-China Emerging Technologies (UCET’21). IEEE, Los Alamitos, CA, 59–66.
[127]
Shengyun Wei, Shun Zou, Feifan Liao, and Weiman Lang. 2020. A comparison on data augmentation methods based on deep learning for audio classification. Journal of Physics: Conference Series 1453, 1 (2020), 012085.
[128]
Felix Weninger and Björn Schuller. 2011. Audio recognition in the wild: Static and dynamic classification on a real-world database of animal vocalizations. In Proceedings of the 2011 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’11). IEEE, Los Alamitos, CA, 337–340.
[129]
Zhizheng Wu and Simon King. 2016. Investigating gated recurrent networks for speech synthesis. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’16). IEEE, Los Alamitos, CA, 5140–5144.
[130]
Steven Wyatt, David Elliott, Akshay Aravamudan, Carlos E. Otero, Luis D. Otero, Georgios C. Anagnostopoulos, Anthony O. Smith, Adrian M. Peter, Wesley Jones, Steven Leung, and Eric Lam. 2021. Environmental sound classification with tiny transformers in noisy edge environments. In Proceedings of the IEEE 7th World Forum on Internet of Things (WF-IoT’21). IEEE, Los Alamitos, CA, 309–314.
[131]
Nina Sofia Wyniawskyj, Milena Napiorkowska, David Petit, Pritimoy Podder, and Paula Marti. 2019. Forest monitoring in Guatemala using satellite imagery and deep learning. In Proceedings of the IEEE International Geoscience and Remote Sensing Symposium (IGARSS’19). IEEE, Los Alamitos, CA, 6598–6601.
[132]
Jie Xie, Kai Hu, Mingying Zhu, Jinghu Yu, and Qibing Zhu. 2019. Investigation of different CNN-based models for improved bird sound classification. IEEE Access 7 (2019), 175353–175361.
[133]
Lidong Yang, Jiangtao Hu, and Zhuangzhuang Zhang. 2019. Audio scene classification based on gated recurrent unit. In Proceedings of the IEEE International Conference on Signal, Information, and Data Processing (ICSIDP’19). IEEE, Los Alamitos, CA, 1–5.
[134]
Jiaxing Ye, Takumi Kobayashi, Nobuyuki Toyama, Hiroshi Tsuda, and Masahiro Murakawa. 2018. Acoustic scene classification using efficient summary statistics and multiple spectro-temporal descriptor fusion. Applied Sciences 8, 8 (2018), 1363.
[135]
Yuzhong Wu and Tan Lee. 2018. Reducing model complexity for DNN based large-scale audio classification. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’18). IEEE, Los Alamitos, CA, 331–335.
[136]
Shuo Zhang, Demin Gao, Haifeng Lin, and Quan Sun. 2019. Wildfire detection using sound spectrum analysis based on the Internet of Things. Sensors 19, 23 (2019), 5093.
[137]
Sai-Hua Zhang, Zhao Zhao, Zhi-Yong Xu, Kristen Bellisario, and Bryan C. Pijanowski. 2018. Automatic bird vocalization identification based on fusion of spectral pattern and texture features. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP’18). IEEE, Los Alamitos, CA, 271–275.
[138]
Zhao Zhao, Sai-Hua Zhang, Zhi-Yong Xu, Kristen Bellisario, Nian-Hua Dai, Hichem Omrani, and Bryan C. Pijanowski. 2017. Automated bird acoustic event detection and robust species classification. Ecological Informatics 39 (2017), 99–108.
[139]
Pablo Zinemanas, Martín Rocamora, Marius Miron, Frederic Font, and Xavier Serra. 2021. An interpretable deep learning model for automatic sound classification. Electronics 10, 7 (2021), 850.
[140]
Imran Zualkernan, Jacky Judas, Taslim Mahbub, Azadan Bhagwagar, and Priyanka Chand. 2021. An AIoT system for bat species classification. In Proceedings of the IEEE International Conference on Internet of Things and Intelligence System (IoTaIS’21). IEEE, Los Alamitos, CA, 155–160.
[141]
Ágnes Incze, Henrietta-Bernadett Jancsó, Zoltán Szilágyi, Attila Farkas, and Csaba Sulyok. 2018. Bird sound recognition using a convolutional neural network. In Proceedings of the IEEE 16th International Symposium on Intelligent Systems and Informatics (SISY’18). IEEE, Los Alamitos, CA, 295–300.



      Published In

      ACM Computing Surveys, Volume 56, Issue 3
      March 2024, 977 pages
      EISSN: 1557-7341
      DOI: 10.1145/3613568

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 05 October 2023
      Online AM: 01 September 2023
      Accepted: 21 August 2023
      Revised: 31 May 2023
      Received: 02 August 2022
      Published in CSUR Volume 56, Issue 3


      Author Tags

      1. Sound processing
      2. edge computing
      3. deep learning
      4. artificial intelligence
      5. Internet of Things

      Qualifiers

      • Survey


      Article Metrics

      • Downloads (Last 12 months)565
      • Downloads (Last 6 weeks)72
      Reflects downloads up to 30 Dec 2024


      Cited By
      • (2024) A Comparative Study of Preprocessing and Model Compression Techniques in Deep Learning for Forest Sound Classification. Sensors 24, 4 (2024), 1149. DOI: 10.3390/s24041149. Online publication date: 9-Feb-2024.
      • (2024) Hardware-aware Neural Architecture Search for Sound Classification in Constrained Environments. In Proceedings of the 2024 International Research Conference on Smart Computing and Systems Engineering (SCSE). 1–6. DOI: 10.1109/SCSE61872.2024.10550556. Online publication date: 4-Apr-2024.
      • (2024) Comparison of Hand-Crafted and Deep Features Towards Explainable AI at the Edge for Analysis of Audio Scenes. In Proceedings of the 2024 IEEE 34th International Workshop on Machine Learning for Signal Processing (MLSP). 1–6. DOI: 10.1109/MLSP58920.2024.10734782. Online publication date: 22-Sep-2024.
      • (2024) Identification of Sounds Using Deep Learning with MFCC Features Extraction. In Proceedings of the 2024 IEEE AITU: Digital Generation Conference. 60–64. DOI: 10.1109/IEEECONF61558.2024.10585335. Online publication date: 3-Apr-2024.
      • (2024) Assessing the Complexity and Real-Time Performance of Anomaly Detection Algorithms in Resource-Constrained Environments. In Proceedings of the 2024 IEEE 20th International Conference on Intelligent Computer Communication and Processing (ICCP). 1–8. DOI: 10.1109/ICCP63557.2024.10793006. Online publication date: 17-Oct-2024.
      • (2024) AESHML: An Automatic Editing Method for Soccer Match Highlights Using Multimodal Learning. IEEE Access 12 (2024), 129967–129974. DOI: 10.1109/ACCESS.2024.3458803.
      • (2024) Assessing the affective quality of soundscape for individuals: Using third-party assessment combined with an artificial intelligence (TPA-AI) model. Science of The Total Environment 953 (2024), 176083. DOI: 10.1016/j.scitotenv.2024.176083. Online publication date: Nov-2024.
