Automatic Identification of Bird Species from Audio

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2021)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12672)

Abstract

Identifying bird species is an important but time-consuming task for ornithologists and ecologists. As the volume of annotated audio data grows, automatic bird classification using machine learning techniques has become an important trend in the scientific community. Analyzing bird behavior and population trends helps detect other organisms in the environment and is an important problem in ecology. Bird populations react quickly to environmental changes, which makes counting and tracking them in real time both challenging and very useful. A reliable methodology that automatically identifies bird species from audio would therefore be a valuable tool for experts in a range of scientific and applied domains.

The goal of this work is to propose a methodology that identifies bird species by their chirps. In this paper we explore deep learning techniques used in this domain, such as Convolutional Neural Networks and Recurrent Neural Networks, to classify the data. In deep learning, audio problems are commonly approached by converting the signals into images using audio feature extraction techniques such as Mel Spectrograms and Mel Frequency Cepstral Coefficients. We propose and test multiple combinations of deep learning models and feature extraction techniques in order to find the most suitable approach to this problem.
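As a rough illustration of the audio-to-image step mentioned above, the following NumPy-only sketch computes a log-mel spectrogram and derives MFCC-style coefficients from it. It is not the pipeline used in the paper: the function names, the frame and hop sizes, the number of mel bands, the mel formula variant, and the synthetic 3 kHz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # One common mel-scale formula (an assumption; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    edges = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2))
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def log_mel_spectrogram(signal, sr, n_fft=1024, hop=512, n_mels=40):
    """Return a (n_mels, n_frames) log-mel 'image' of a mono signal."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[t * hop: t * hop + n_fft] * window
                       for t in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (n_frames, n_fft // 2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T     # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)                   # decibel-like scale

def mfcc(log_mel, n_mfcc=13):
    """Unnormalised DCT-II along the mel axis, keeping the first n_mfcc coefficients."""
    n_mels = log_mel.shape[0]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return basis @ log_mel

# Hypothetical input: one second of a pure 3 kHz tone standing in for a chirp.
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 3000.0 * t)

image = log_mel_spectrogram(tone, sr)   # 2D array a CNN could consume
coeffs = mfcc(image)                    # compact per-frame features
print(image.shape, coeffs.shape)
```

Either array can be treated as a single-channel image (mel band or coefficient index on one axis, time on the other). In practice a maintained audio library would normally replace the hand-written filterbank shown here.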

Acknowledgements

This work is financed by National Funds through the Portuguese funding agency, FCT - Fundação para a Ciência e a Tecnologia, within project UIDB/50014/2020.

Author information

Corresponding author

Correspondence to Elsa Ferreira Gomes.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Carvalho, S., Gomes, E.F. (2021). Automatic Identification of Bird Species from Audio. In: Nguyen, N.T., Chittayasothorn, S., Niyato, D., Trawiński, B. (eds) Intelligent Information and Database Systems. ACIIDS 2021. Lecture Notes in Computer Science, vol 12672. Springer, Cham. https://doi.org/10.1007/978-3-030-73280-6_4

  • DOI: https://doi.org/10.1007/978-3-030-73280-6_4

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-73279-0

  • Online ISBN: 978-3-030-73280-6

  • eBook Packages: Computer Science, Computer Science (R0)
