Sound based alarming based video surveillance system design

Multimedia Tools and Applications

Abstract

Modern video surveillance systems consist of networks of many video cameras. Such systems are continually being installed for security reasons in prisons, elevators, automatic teller machines, and other locations. Usually, the cameras are connected to display screens on which security personnel watch for suspicious activity. Because the personnel must monitor multiple locations simultaneously, this manual task is labor intensive and inefficient. Camera systems have other drawbacks as well: their coverage is limited, and security personnel cannot see every point in a scene even while they are watching the camera feed. Therefore, in most cases, other sensors should accompany the video cameras. Although audio surveillance is still at an early stage, a considerable amount of work has been done in this area over the last decade. Even so, there are currently no practical audio surveillance solutions for security on the market. In this paper, audio surveillance is integrated into current video surveillance systems using deep learning. We develop a complete system and demonstrate a working prototype. The results are encouraging: the system performs well enough to be used in real-life settings.

Author information

Corresponding author

Correspondence to Yüksel Arslan.

Cite this article

Arslan, Y., Canbolat, H. Sound based alarming based video surveillance system design. Multimed Tools Appl 81, 7969–7991 (2022). https://doi.org/10.1007/s11042-022-12028-6

  • DOI: https://doi.org/10.1007/s11042-022-12028-6
