Abstract
Monitoring and mitigating the ongoing decline of biodiversity is one of the most important global challenges that we face today. In this context, birds are important for many ecosystems, since they link habitats, resources, and biological processes and thus serve as important early warning indicators for the health of an ecosystem. State-of-the-art bird species recognition approaches typically rely on a closed-world assumption, i.e., deep learning models are trained once on an acquired dataset. However, changing environmental conditions may decrease the recognition quality. In this paper, we present a distributed system for bird species recognition based on active learning with human feedback to improve a deployed deep neural network model during operation. The system consists of three components: an embedded edge device for real-time bird species recognition and detection of misclassifications, a client-server web application for gathering human feedback and a backend component for training, evaluation, and deployment. Misclassifications during operation are detected based on a novel combination of reliability scores and an ensemble consisting of a bird detection and a bird species recognition model. Wrongly classified examples are sent to the human feedback component. Once sufficient feedback examples are labeled by a human expert, a new training process is triggered in the backend, and the trained deep learning model is optimized and deployed on the edge device. We performed several experiments to evaluate the quality of the bird species recognition model, the detection of misclassifications, and the overall system to demonstrate the feasibility of the proposed approach.
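The misclassification check and the retraining trigger described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the reliability score is assumed to be the maximum softmax probability, the ensemble rule is assumed to be a simple disagreement test between a bird detector and the species classifier, and the names `is_suspect`, `FeedbackQueue`, and all thresholds are hypothetical.

```python
import math
from dataclasses import dataclass, field


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def is_suspect(species_logits, bird_prob, conf_threshold=0.7, bird_threshold=0.5):
    """Flag a prediction as a possible misclassification.

    Assumed rule (illustrative only): the example is suspect when the two
    models disagree, i.e. the detector hears a bird but the species model
    is unsure, or the species model is confident but the detector hears none.
    """
    reliability = max(softmax(species_logits))  # assumed reliability score
    bird_present = bird_prob >= bird_threshold
    confident = reliability >= conf_threshold
    return bird_present != confident


@dataclass
class FeedbackQueue:
    """Collects expert labels; signals when enough exist to retrain."""
    retrain_after: int = 3  # hypothetical threshold
    labeled: list = field(default_factory=list)

    def add_label(self, clip_id, species):
        self.labeled.append((clip_id, species))
        # True means the backend should start a new training run.
        return len(self.labeled) >= self.retrain_after
```

In such a setup, the edge device would call `is_suspect` on each prediction and forward flagged clips to the web application, while the backend would start training once `add_label` returns `True`.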
Acknowledgments
This work is funded by the Hessian State Ministry for Higher Education, Research and the Arts (HMWK) (LOEWE Natur 4.0, LOEWE emergenCITY, and hessian.AI Connectom AI4Birds, AI4BirdsDemo), and the German Research Foundation (DFG, Project 210487104 - SFB 1053 MAKI).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Bellafkir, H., Vogelbacher, M., Schneider, D., Mühling, M., Korfhage, N., Freisleben, B. (2023). Edge-Based Bird Species Recognition via Active Learning. In: Mohaisen, D., Wies, T. (eds) Networked Systems. NETYS 2023. Lecture Notes in Computer Science, vol 14067. Springer, Cham. https://doi.org/10.1007/978-3-031-37765-5_2
DOI: https://doi.org/10.1007/978-3-031-37765-5_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-37764-8
Online ISBN: 978-3-031-37765-5
eBook Packages: Computer Science, Computer Science (R0)