Abstract
Large-scale multimedia surveillance installations usually consist of a number of spatially distributed video cameras that are installed on a premises and connected to a central control station, where human operators (e.g., security personnel) remotely monitor the scene images captured by the cameras. In the majority of these systems the ratio of human operators to camera views is very low, so some important events may be missed. Studies have shown that a human operator can effectively monitor only four camera views. Moreover, a human operator's visual attention drops below an acceptable level during the task of visual monitoring. Therefore, there is a need to select the four most relevant camera views at any given time instant. This paper proposes a human-centric approach to the problem of dynamically selecting and scheduling the four best camera views. In the proposed approach, a feedback camera observes the human operator who is monitoring the surveillance camera feeds. Using this information, the system computes the operator’s attention to each camera view and from it automatically determines the importance of the events being captured by the respective cameras. This real-time, non-invasive relevance feedback is then combined with automatic event detection to compute the four best feeds. Experiments show that the proposed approach improves the identification of important events occurring in the environment.
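The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not say how attention feedback and event detection are combined, so the weighted sum, the weight `alpha`, the score ranges, and the function name `select_views` are all assumptions made for illustration.

```python
def select_views(event_scores, attention_scores, k=4, alpha=0.5):
    """Return the ids of the k cameras with the highest combined score.

    event_scores:     camera id -> automatic event-detection score (assumed in [0, 1])
    attention_scores: camera id -> operator-attention score (assumed in [0, 1])
    alpha:            hypothetical weight given to automatic event detection
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")

    # Assumed fusion rule: a convex combination of the two scores.
    # A camera with no attention measurement is treated as score 0.
    combined = {
        cam: alpha * event_scores[cam]
        + (1.0 - alpha) * attention_scores.get(cam, 0.0)
        for cam in event_scores
    }

    # Highest score first; ties are broken by camera id so the result is stable.
    ranked = sorted(combined, key=lambda cam: (-combined[cam], cam))
    return ranked[:k]


if __name__ == "__main__":
    event = {"c1": 0.9, "c2": 0.1, "c3": 0.4, "c4": 0.7, "c5": 0.2, "c6": 0.6}
    attention = {"c1": 0.2, "c2": 0.8, "c3": 0.4, "c5": 0.1, "c6": 0.9}
    print(select_views(event, attention))
```

With `k=4` the function returns the four feeds to display. Re-running it at every time instant with updated scores gives a simple form of dynamic scheduling, again only under the assumptions stated above.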
Additional information
This work was supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant 408206 and the University of Winnipeg Major Research Grant 607062.
About this article
Cite this article
Atrey, P.K., El Saddik, A. & Kankanhalli, M.S. Effective multimedia surveillance using a human-centric approach. Multimed Tools Appl 51, 697–721 (2011). https://doi.org/10.1007/s11042-010-0649-1
DOI: https://doi.org/10.1007/s11042-010-0649-1