Abstract
In this paper, we propose a unified framework for anomaly detection and localization in crowded scenes. For each video frame, we extract spatio-temporal sparse features of 3D blocks and generate a saliency map using a block-based center-surround difference operator. Two sparse coding strategies, off-line long-term sparse representation and on-line short-term sparse representation, are integrated within our framework. The abnormality of each candidate is measured using bottom-up saliency and top-down fixation inference, and a binary classifier then uses this measure to label frames as normal or anomalous. Local abnormal events are localized and segmented based on the saliency map. In the experiments, we compare our method against several state-of-the-art approaches on the UCSD data set, a widely used benchmark for anomaly detection and localization. Our method achieves competitive results relative to state-of-the-art methods while running at near real-time speed.
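
To make the idea of a block-based center-surround difference concrete, the following is a minimal sketch and not the operator used in the paper. It assumes each block is already described by a feature vector (for example, sparse coefficients) and scores a block by its mean Euclidean distance to the neighbouring blocks. The function name `center_surround_saliency`, the `radius` parameter, and the choice of Euclidean distance are all illustrative assumptions; the abstract does not specify them.

```python
import math


def center_surround_saliency(features, radius=1):
    """Score each block by its mean distance to the surrounding blocks.

    features: rows x cols grid (nested lists) of equal-length feature
    vectors, one per block. Hypothetical layout chosen for illustration.
    """
    rows, cols = len(features), len(features[0])
    saliency = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            total, count = 0.0, 0
            # Visit the surround window, clipped at the grid border.
            for rr in range(max(0, r - radius), min(rows, r + radius + 1)):
                for cc in range(max(0, c - radius), min(cols, c + radius + 1)):
                    if (rr, cc) == (r, c):
                        continue  # skip the center block itself
                    total += math.dist(features[r][c], features[rr][cc])
                    count += 1
            saliency[r][c] = total / count if count else 0.0
    return saliency


# Toy input: a 5 x 5 grid of identical blocks with one outlier in the middle.
grid = [[[0.0] * 4 for _ in range(5)] for _ in range(5)]
grid[2][2] = [1.0, 1.0, 1.0, 1.0]
saliency_map = center_surround_saliency(grid)
```

In this toy grid the outlier block receives the highest score, its immediate neighbours receive a small non-zero score, and blocks far from it score zero, which is the behaviour a saliency map needs in order to localize an anomalous region.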













Acknowledgments
The work was supported in part by the Special Social Science Foundation of Heilongjiang Province of China No. 11D083, and the National Science Foundation of China No. 61472103, 2015BAF32B01-4.
Cite this article
Wang, C., Yao, H. & Sun, X. Anomaly detection based on spatio-temporal sparse representation and visual attention analysis. Multimed Tools Appl 76, 6263–6279 (2017). https://doi.org/10.1007/s11042-015-3199-8
DOI: https://doi.org/10.1007/s11042-015-3199-8