Abstract
This paper addresses the problem of modeling human behavior patterns captured in surveillance videos for online normal behavior recognition and anomaly detection. A novel framework is developed for automatic behavior modeling and online anomaly detection that does not require manual labeling of the training data set. The framework consists of three key components. 1) A compact and effective behavior representation is developed based on spatial-temporal interest point detection. 2) The natural grouping of behavior patterns is determined by a novel clustering algorithm, the topic hidden Markov model (THMM), which builds on the existing hidden Markov model (HMM) and latent Dirichlet allocation (LDA) and overcomes current limitations in accuracy, robustness, and computational efficiency. The new model is a four-level hierarchical Bayesian model in which each video is modeled as a Markov chain of behavior patterns, and each behavior pattern is a distribution over segments of the video. Each segment is in turn modeled as a mixture of actions, where each action is a distribution over spatial-temporal words. 3) An online anomaly measure is introduced to detect abnormal behavior, while normal behavior is recognized from visual evidence accumulated at runtime using the likelihood ratio test (LRT). Experimental results on noisy and sparse data sets collected from a real surveillance scenario demonstrate the effectiveness and robustness of the approach.
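To make the hierarchy described in the abstract more concrete, the following is a minimal generative sketch, not the authors' implementation. It assumes a simplified reading of the model: behavior patterns follow a first-order Markov chain across segments, each pattern fixes a mixture over actions, and each action is a distribution over spatial-temporal word ids. The function name `sample_video` and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def sample_video(num_segments, words_per_segment, init, trans,
                 action_mix, word_dist, rng):
    """Sample a synthetic video as a list of (pattern, word_ids) segments.

    Hypothetical simplification of the hierarchy in the abstract:
      init[k]          probability that the first segment has pattern k
      trans[k][j]      probability of moving from pattern k to pattern j
      action_mix[k][a] probability of action a under pattern k
      word_dist[a][w]  probability of spatial-temporal word w under action a
    """
    patterns = list(range(len(init)))
    actions = list(range(len(word_dist)))
    vocab = list(range(len(word_dist[0])))

    video = []
    pattern = rng.choices(patterns, weights=init)[0]
    for _ in range(num_segments):
        words = []
        for _ in range(words_per_segment):
            # Each word first picks an action from the pattern's mixture,
            # then picks a word id from that action's distribution.
            action = rng.choices(actions, weights=action_mix[pattern])[0]
            words.append(rng.choices(vocab, weights=word_dist[action])[0])
        video.append((pattern, words))
        # Markov step: the next segment's pattern depends only on this one.
        pattern = rng.choices(patterns, weights=trans[pattern])[0]
    return video


# Toy parameters (assumed): 2 behavior patterns, 3 actions, 4 words.
INIT = [0.5, 0.5]
TRANS = [[0.9, 0.1], [0.2, 0.8]]
ACTION_MIX = [[0.7, 0.2, 0.1], [0.1, 0.2, 0.7]]
WORD_DIST = [[0.4, 0.4, 0.1, 0.1],
             [0.25, 0.25, 0.25, 0.25],
             [0.1, 0.1, 0.4, 0.4]]

video = sample_video(5, 8, INIT, TRANS, ACTION_MIX, WORD_DIST,
                     random.Random(0))
for pattern, words in video:
    print(pattern, words)
```

Fitting such a model would run in the opposite direction, inferring the patterns and actions from observed words; the sketch only shows how the levels nest.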
Author information
Xudong Zhu is a PhD candidate at the School of Computer Science and Technology, Xidian University. He received his B.S. degree from Xidian University in 1996 and his M.Sc. degree in computer science from Northwestern Polytechnic University in 2005. His research interest is data mining.
Zhijing Liu is a Professor and doctoral advisor at the School of Computer Science and Technology, Xidian University. He received his bachelor's degree from Xidian University in 1982. His research focuses on vision computing, network multimedia, virtual reality, and key technologies of E-government and E-commerce.
About this article
Cite this article
Zhu, X., Liu, Z. Human behavior clustering for anomaly detection. Front. Comput. Sci. China 5, 279–289 (2011). https://doi.org/10.1007/s11704-011-0080-4
DOI: https://doi.org/10.1007/s11704-011-0080-4