DOI: 10.1145/3503161.3548199
research-article

Hierarchical Scene Normality-Binding Modeling for Anomaly Detection in Surveillance Videos

Published: 10 October 2022

ABSTRACT

Anomaly detection in surveillance videos is an important topic in the multimedia community; it requires efficient extraction of scene context and the capture of temporal information as a basis for decisions. From the perspective of hierarchical modeling, we parse the surveillance scene from global to local and propose a Hierarchical Scene Normality-Binding Modeling framework (HSNBM) to handle anomaly detection. For the static background hierarchy, we design a Region Clustering-driven Multi-task Memory Autoencoder (RCM-MemAE), which simultaneously performs region segmentation and scene reconstruction. The normal prototypes of each local region are stored, and the frame reconstruction error is subsequently amplified by global memory augmentation. For the dynamic foreground object hierarchy, we employ a Scene-Object Binding Frame Prediction module (SOB-FP) to bind all foreground objects in the frame with the prototypes stored in the background hierarchy according to their positions, thus fully exploiting the normality relationship between foreground and background. The bound features are then fed into the decoder to predict the future movement of the objects. With the binding mechanism between foreground and background, HSNBM effectively integrates the "reconstruction" and "prediction" tasks and builds a semantic bridge between the two hierarchies. Finally, HSNBM fuses the anomaly scores of the two hierarchies to make a comprehensive decision. Extensive empirical studies on three standard video anomaly detection datasets demonstrate the effectiveness of the proposed HSNBM framework.
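
The abstract does not give implementation details, so the following is only a minimal sketch of the three ideas it names: reading normal prototypes from a memory, binding a foreground object to the prototypes of the region it sits in, and fusing the two anomaly scores. Everything here is an assumption made for illustration: the function names, the cosine-similarity softmax addressing (in the spirit of memory-augmented autoencoders), the concatenation used for binding, and the min-max normalized weighted sum used for fusion. It should not be read as the authors' actual method.

```python
import numpy as np


def memory_read(query, memory):
    """Soft-address a memory of normal prototypes (assumed addressing scheme).

    query:  (D,) feature vector
    memory: (N, D) array with one prototype per row
    Returns a convex combination of the prototypes, shape (D,).
    """
    q = query / (np.linalg.norm(query) + 1e-8)
    m = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + 1e-8)
    sim = m @ q                      # cosine similarity to each prototype
    w = np.exp(sim - sim.max())      # numerically stable softmax
    w /= w.sum()
    return w @ memory


def bind_object(obj_feat, center_xy, region_map, region_memories):
    """Bind an object feature to the prototypes of the region at its position.

    region_map:      (H, W) integer array of region labels (hypothetical)
    region_memories: dict mapping region label -> (N, D) prototype array
    Binding by concatenation is an assumption of this sketch.
    """
    x, y = center_xy
    label = int(region_map[y, x])    # region the object center falls in
    scene_feat = memory_read(obj_feat, region_memories[label])
    return np.concatenate([obj_feat, scene_feat])


def min_max(scores):
    """Rescale per-frame scores to [0, 1]; constant input maps to zeros."""
    s = np.asarray(scores, dtype=float)
    span = s.max() - s.min()
    return np.zeros_like(s) if span == 0 else (s - s.min()) / span


def fuse_scores(recon_err, pred_err, weight=0.5):
    """Weighted sum of normalized reconstruction and prediction errors.

    The weight of 0.5 is a placeholder, not a value from the paper.
    """
    return weight * min_max(recon_err) + (1.0 - weight) * min_max(pred_err)
```

In this sketch a frame would be flagged when its fused score exceeds a threshold chosen on validation data; how the paper actually normalizes and weights the two hierarchies is not stated in the abstract.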



    • Published in

      MM '22: Proceedings of the 30th ACM International Conference on Multimedia
      October 2022
      7537 pages
      ISBN: 9781450392037
      DOI: 10.1145/3503161

      Copyright © 2022 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Qualifiers

      • research-article

      Acceptance Rates

      Overall Acceptance Rate: 995 of 4,171 submissions, 24%
