Abstract
Emergency Siren Recognition (ESR) is an important problem in automotive safety. We focus on the early recognition of ambulance sirens in urban scenarios, where noise from a wide variety of sources prevents drivers from perceiving alarm sounds. In this paper, we propose a deep convolutional neural network for the ESR task, based on the encoding path of U-Net. Because recording real sirens in traffic is difficult, we implemented an algorithm that generates a synthetic dataset reproducing the sound of a siren in multiple urban traffic contexts. The network detects the presence of the alerting sound from spectrogram-like features. Our experimental evaluation shows that the approach performs well in both mono-scenario and multi-scenario settings at very low SNRs, including in conditions unseen during training, which we attribute to the large amount of training data.
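The abstract does not detail the dataset generation algorithm. As a rough illustration of its core idea, placing a siren in traffic noise at a chosen low SNR, the following minimal sketch scales a siren signal against a noise signal to reach a target SNR. The function names `two_tone_siren` and `mix_at_snr`, the tone frequencies, the sample rate, and the use of white noise as a stand-in for urban recordings are all assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def two_tone_siren(sr, duration, f_low, f_high, period):
    """Hypothetical two-tone siren: alternates between f_low and f_high.

    The frequencies and switching period are illustrative parameters,
    not values taken from the paper.
    """
    t = np.arange(int(sr * duration)) / sr
    # First half of each period uses f_low, second half uses f_high.
    freq = np.where((t % period) < period / 2, f_low, f_high)
    # Integrate the instantaneous frequency to obtain a continuous phase.
    phase = 2 * np.pi * np.cumsum(freq) / sr
    return np.sin(phase)


def mix_at_snr(siren, noise, snr_db):
    """Scale `siren` so its power relative to `noise` equals snr_db, then add.

    Returns the mixture and the scaled siren.
    """
    siren = np.asarray(siren, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if siren.shape != noise.shape:
        raise ValueError("siren and noise must have the same shape")
    p_siren = np.mean(siren ** 2)
    p_noise = np.mean(noise ** 2)
    if p_siren == 0 or p_noise == 0:
        raise ValueError("siren and noise must have non-zero power")
    # SNR_dB = 10 * log10(gain^2 * p_siren / p_noise), solved for gain.
    gain = np.sqrt(p_noise / p_siren * 10 ** (snr_db / 10))
    scaled = gain * siren
    return scaled + noise, scaled


sr = 16000  # assumed sample rate
rng = np.random.default_rng(0)
noise = rng.standard_normal(sr)  # white noise as a placeholder for traffic
siren = two_tone_siren(sr, 1.0, 400.0, 650.0, 0.5)
mixture, scaled = mix_at_snr(siren, noise, -20.0)
measured_snr = 10 * np.log10(np.mean(scaled ** 2) / np.mean(noise ** 2))
```

In a fuller pipeline, `noise` would presumably be replaced by recorded urban backgrounds and the mixture converted to a spectrogram-like representation before being passed to the network.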
Acknowledgement
This work is supported by Marche Region in implementation of the financial programme POR MARCHE FESR 2014–2020, project “Miracle” (Marche Innovation and Research fAcilities for Connected and sustainable Living Environments), CUP B28I19000330007.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Cantarini, M., Serafini, L., Gabrielli, L., Principi, E., Squartini, S. (2020). Emergency Siren Recognition in Urban Scenarios: Synthetic Dataset and Deep Learning Models. In: Huang, DS., Bevilacqua, V., Hussain, A. (eds) Intelligent Computing Theories and Application. ICIC 2020. Lecture Notes in Computer Science, vol 12463. Springer, Cham. https://doi.org/10.1007/978-3-030-60799-9_18
DOI: https://doi.org/10.1007/978-3-030-60799-9_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-60798-2
Online ISBN: 978-3-030-60799-9
eBook Packages: Computer Science, Computer Science (R0)