Authors:
Chirag Samal 1; Prince Yadav 2; Sakshi Singh 1; Satyanarayana Vollala 2 and Amrita Mishra 3
Affiliations:
1 Dept. of Electronics and Communication Engineering, International Institute of Information Technology, Naya Raipur, India
2 Dept. of Computer Science and Engineering, International Institute of Information Technology, Naya Raipur, India
3 International Institute of Information Technology, Bangalore, India
Keyword(s):
Deep Learning, Bird Species Identification, Speech Recognition, Convolutional Neural Network.
Abstract:
Recent developments in machine learning and deep learning have made it possible to extend traditional audio pattern recognition to real-time, practical applications. This work proposes a novel framework, robust bird species identification using neural network (RoBINN), which identifies bird species from their unique vocal signatures. To make the network robust and efficient, data augmentation is performed to create synthetic training samples for bird species with fewer available recordings. Further, inherent properties of audio signals are leveraged via speech recognition-based feature engineering techniques to develop an end-to-end convolutional neural network (CNN). Additionally, the proposed CNN architecture employs residual learning and an attention mechanism to generate attention-aware features, which enhance the overall accuracy of birdcall identification. The proposed architecture uses an extensive dataset of 21,375 recordings corresponding to 264 bird species. Experimental results validate the proposed bird species classification technique in terms of accuracy, F1-score, and binary cross-entropy loss.
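The abstract does not state which augmentation operations are used. As a minimal sketch only, the following assumes two common waveform-level augmentations, a random circular time shift and additive white Gaussian noise at a fixed signal-to-noise ratio, to show how synthetic samples might be generated for a species with few recordings. The function names, the 20 dB SNR, the 20% shift limit, and the synthetic 3 kHz tone standing in for a birdcall are all illustrative assumptions, not details from the paper.

```python
import numpy as np


def add_noise(wave, snr_db, rng):
    """Add white Gaussian noise so the result has the requested SNR in dB."""
    signal_power = np.mean(wave ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=wave.shape)
    return wave + noise


def time_shift(wave, max_fraction, rng):
    """Circularly shift the waveform by a random offset of up to max_fraction of its length."""
    limit = int(len(wave) * max_fraction)
    offset = int(rng.integers(-limit, limit + 1))
    return np.roll(wave, offset)


def augment(wave, n_copies, seed=0):
    """Create n_copies synthetic variants of one recording (hypothetical recipe)."""
    rng = np.random.default_rng(seed)
    return [add_noise(time_shift(wave, 0.2, rng), 20.0, rng) for _ in range(n_copies)]


# Stand-in for a one-second birdcall: a 3 kHz tone sampled at 22.05 kHz (assumed values).
sample_rate = 22050
t = np.arange(sample_rate) / sample_rate
call = np.sin(2 * np.pi * 3000 * t)

copies = augment(call, n_copies=3)
print(len(copies), copies[0].shape)
```

In such a scheme, each under-represented species would receive enough synthetic copies to bring its sample count closer to that of well-represented species before feature extraction.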