Probability Enhanced Entropy (PEE) Novel Feature for Improved Bird Sound Classification

Murugaiya, Ramashini; Abas, Pg Emeroylariffion; De Silva, Liyanage Chandratilak

doi:10.1007/s11633-022-1318-3

Research Article
Published: 21 January 2022

Volume 19, pages 52–62, (2022)

Machine Intelligence Research

Abstract

Identifying bird species from their sounds has become an important area of biodiversity-related research because bird sounds are relatively easy to capture, even in habitats that are often challenging to work in. Audio features have a large impact on the classification task, since they are the fundamental elements used to differentiate classes. Extracting informative properties of the data is therefore a crucial stage of any classification-based application, and it is vital to identify the features that best represent the actual bird sounds. In this paper, modified Gammatone frequency cepstral coefficient (GTCC) features are extracted, with their frequency banks adjusted to suit bird sounds. The features are then used to train and test a support vector machine (SVM) classifier. The modified GTCC features are shown to give 86% accuracy with twenty Bornean birds. Furthermore, we propose a novel probability enhanced entropy (PEE) feature which, when combined with the modified GTCC features, improves accuracy further to 89.5%. These results are significant because the relatively low-resource SVM, together with the proposed modified GTCC and PEE features, can be implemented in a real-time system to assist researchers, scientists, conservationists, and even eco-tourists in identifying bird species in dense forest.
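The abstract does not state how the GTCC frequency banks were adjusted, so the following is only a minimal sketch of the general idea, not the authors' method. It places gammatone filter center frequencies at equal spacing on the Glasberg and Moore ERB-rate scale, which is a common convention for gammatone filter banks. The band limits (1 kHz to 8 kHz) and the filter count (20) are hypothetical values chosen for illustration, and the function names are assumptions.

```python
import math


def hz_to_erb_rate(f_hz):
    """Map a frequency in Hz to the Glasberg-Moore ERB-rate scale."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate: map an ERB-rate value back to Hz."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437


def gammatone_center_frequencies(f_low, f_high, n_filters):
    """Return n_filters center frequencies (Hz), equally spaced on the
    ERB-rate scale between f_low and f_high inclusive."""
    if n_filters < 2:
        raise ValueError("n_filters must be at least 2")
    if not 0.0 <= f_low < f_high:
        raise ValueError("require 0 <= f_low < f_high")
    lo = hz_to_erb_rate(f_low)
    hi = hz_to_erb_rate(f_high)
    step = (hi - lo) / (n_filters - 1)
    return [erb_rate_to_hz(lo + i * step) for i in range(n_filters)]


# Hypothetical band limits and filter count; the paper's actual values
# are not given in the abstract.
centers = gammatone_center_frequencies(1000.0, 8000.0, 20)
```

Narrowing `f_low` and `f_high` to the band where the target vocalizations carry most of their energy concentrates the filters there instead of spreading them over a speech-oriented range. The resulting center frequencies would then parameterize the gammatone filters whose log energies are transformed into cepstral coefficients.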

Author information

Authors and Affiliations

Faculty of Integrated Technologies, Universiti Brunei Darussalam, Bandar Seri Begawan, BE1410, Brunei Darussalam
Ramashini Murugaiya, Pg Emeroylariffion Abas & Liyanage Chandratilak De Silva
Department of Computer Science and Informatics, Uva Wellassa University, Badulla, 90000, Sri Lanka
Ramashini Murugaiya

Corresponding author

Correspondence to Ramashini Murugaiya.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Ramashini Murugaiya received the B.Tech. degree in information technology from Anna University, India in 2012, the M. Eng. degree in computer and communication engineering from Anna University, India in 2014, and the Ph.D. degree in systems engineering from Universiti Brunei Darussalam, Brunei Darussalam in 2021. She is now a lecturer at Uva Wellassa University, Sri Lanka.

Her research interests include audio signal processing, biomedical signal processing, bio and environmental acoustics. E-mail: ramashini@uwu.ac.lk (Corresponding author) ORCID iD: 0000-0001-5651-4674

Pg Emeroylariffion Abas received the B. Eng. degree in information systems engineering from Imperial College, UK in 2001, and the Ph.D. degree in communication systems from the same institution in 2005. He is now an assistant professor in system engineering in the Faculty of Integrated Technologies, Universiti Brunei Darussalam, Brunei Darussalam.

His research interests include data analysis, security of infocommunication systems and design of photonic crystal fiber in fiber optics communication. E-mail: emeroylariffion.abas@ubd.edu.bn ORCID iD: 0000-0002-7006-3838

Liyanage Chandratilak De Silva received the B. Sc. (Hons) degree from the University of Moratuwa, Sri Lanka in 1985, the M. Phil. degree in electrical and computer engineering from The Open University of Sri Lanka, Sri Lanka in 1989, and the M. Eng. degree in information engineering and the Ph.D. degree in electrical engineering from University of Tokyo, Japan in 1992 and 1995, respectively. He was with the University of Tokyo, Japan from 1989 to 1995. From April 1995 to March 1997, he was a postdoctoral researcher at ATR (Advanced Telecommunication Research) Laboratories, Japan. In March 1997, he joined National University of Singapore as a lecturer, where he was an assistant professor until June 2003. He was with Massey University, New Zealand from 2003 to 2007 as a senior lecturer. He joined University of Brunei Darussalam in 2007 as an associate professor. Currently, he is a professor of engineering and the dean of the Faculty of Integrated Technologies, University of Brunei Darussalam, Brunei Darussalam. He has over 30 years of postgraduate experience at various levels of his career in the Asia Pacific region. He has published over 190 technical papers in these areas in international conferences, journals and Japanese national conventions, and holds one Japanese national patent, which was successfully sold to Sony Corporation Japan for commercial utilization. He also holds 3 U.S. patents and 1 Brunei patent, with several patents pending. His works have been cited as among the pioneering works in bimodal (audio and video signal based) emotion recognition by many researchers. His papers have so far been cited more than 3500 times (according to scholar.google.com), with an H-index of 25. He is a senior member of IEEE (USA).

His research interests include internet of things, image and speech signal processing, information theory, computer vision, data analytics, pattern recognition and understanding, smart homes and smart sensors, multimedia signal processing, and digital electronics. E-mail: liyanage.silva@ubd.edu.bn ORCID iD: 0000-0001-7128-5945

About this article

Cite this article

Murugaiya, R., Abas, P.E. & De Silva, L.C. Probability Enhanced Entropy (PEE) Novel Feature for Improved Bird Sound Classification. Mach. Intell. Res. 19, 52–62 (2022). https://doi.org/10.1007/s11633-022-1318-3

Received: 28 June 2021
Accepted: 28 October 2021
Published: 21 January 2022
Issue Date: February 2022
DOI: https://doi.org/10.1007/s11633-022-1318-3
