Abstract
Automatic detection of calling bird species supports environmental monitoring on a broad scale, both temporally and spatially. Many earlier studies have borrowed feature representations from the field of automatic speech recognition. In this study, we evaluated deep neural networks on a dataset of 12,061 audio files to recognize the vocalizations of 22 bird species. The methodology differs from existing approaches by integrating transfer learning. In addition, several feature extraction techniques were applied to the audio to analyze bird sounds, including the Fourier Transform, the Mel-Spectrogram, and Mel Frequency Cepstral Coefficients. The main objective of the study is to develop intelligent systems that can predict the bird species present in a collected set of audio recordings. The work verifies that deep transfer learning models such as ResNet50, DenseNet201, InceptionV3, Xception, and EfficientNet can effectively extract features from, and recognize, the audio signals of different bird species with high prediction accuracy. The best classification accuracy on the validation set was 97.43%, attained by both the DenseNet201 and ResNet50 models. DenseNet201 also incurred the lowest validation loss (0.1080). The Xception model performed best on the training data, achieving 100% training accuracy and the lowest training loss (0.0011). The study thus provides a way to evaluate and compare deep learning models for bird sound recognition.
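The three feature types named in the abstract build on one another, and a minimal sketch may make that relationship concrete: a Fourier power spectrum of one audio frame is pooled through mel-spaced triangular filters to give log-mel energies (one column of a mel-spectrogram), and a discrete cosine transform of those energies gives MFCCs. This is an illustration using only the Python standard library, not the authors' pipeline. The frame length (256), sample rate (8000 Hz), filter count (20), coefficient count (13), the Hann window, and all function names are assumptions chosen for the example.

```python
import cmath
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to the mel scale (common 2595*log10 form)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def power_spectrum(frame):
    """Naive DFT power spectrum of one real frame (bins 0..N/2)."""
    n = len(frame)
    # A Hann window reduces spectral leakage at the frame edges.
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are evenly spaced on the mel scale."""
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge points: filter j spans edges j-1, j, j+1.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[j - 1], edges_hz[j], edges_hz[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def log_mel_energies(frame, sample_rate, n_filters=20):
    """One mel-spectrogram column: log of filterbank-weighted power."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Small offset avoids log(0) for filters that see no energy.
    return [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
            for row in bank]


def mfcc(log_mel, n_coeffs=13):
    """Unnormalized DCT-II of the log-mel energies."""
    m = len(log_mel)
    return [sum(e * math.cos(math.pi * c * (i + 0.5) / m)
                for i, e in enumerate(log_mel))
            for c in range(n_coeffs)]


# Demo on a synthetic 1 kHz tone standing in for a bird call.
SAMPLE_RATE = 8000
frame = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(256)]
spec = power_spectrum(frame)
log_mel = log_mel_energies(frame, SAMPLE_RATE)
coeffs = mfcc(log_mel)
```

In practice, a full mel-spectrogram stacks such columns over overlapping frames, and the resulting 2-D array is what an image-style network such as those named above would typically take as input. An FFT routine from a numerical library would replace the naive DFT used here for readability.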



















Availability of data and material
Not applicable.
Funding
Not applicable.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Code availability
Not applicable.
Cite this article
Kumar, Y., Gupta, S. & Singh, W. A novel deep transfer learning models for recognition of birds sounds in different environment. Soft Comput 26, 1003–1023 (2022). https://doi.org/10.1007/s00500-021-06640-1
DOI: https://doi.org/10.1007/s00500-021-06640-1