Spatial Attention Transformer Based Framework for Anomaly Classification in Image Sequences

  • Conference paper
Intelligent Human Computer Interaction (IHCI 2023)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14532)

Abstract

With crime increasing in crowded and remote areas, video surveillance systems need to recognize abnormal or violent events. Anomaly detection remains a challenging task in computer vision because of variations in color, background, and illumination. In recent years, vision transformers and attention modules in deep learning algorithms have shown promising results. This paper presents an attention-based anomaly detection framework that focuses on the extraction of spatial features. The proposed framework is implemented in two steps. The first step extracts spatial features with the Spatial Attention Module (SAM) and the Shifted Window (SWIN) transformer. In the second step, the extracted features are passed through fully connected layers for binary classification of abnormal or violent activities. A performance analysis of pretrained SWIN transformer variants is also presented to guide the choice of model. Four public benchmark datasets, namely CUHK Avenue, University of Minnesota (UMN), AIRTLab, and Industrial Surveillance (IS), are used for analysis and implementation. The proposed framework outperformed existing state-of-the-art methods by 18% on IS and by 2–20% on Avenue, with accuracies of 98.58% and 100%, respectively.
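
The two-step pipeline described in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract gives no layer sizes, kernel sizes, or details of how SAM and the SWIN backbone are composed, so the code below assumes a CBAM-style spatial attention (channel-wise average and max pooling, a small convolution, and a sigmoid mask) applied to a feature map that is taken as given, followed by a single fully connected unit for the binary decision. The SWIN backbone is omitted, and the names `spatial_attention` and `classify`, the 3x3 kernel, and all weights are hypothetical.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def spatial_attention(feature_map, kernel, bias=0.0):
    """Assumed CBAM-style spatial attention.

    feature_map: list of C channels, each an H x W nested list of floats.
    kernel: 2 x k x k weights (plane 0 for the average-pooled map,
            plane 1 for the max-pooled map), k odd.
    Returns (attended feature map, H x W attention mask).
    """
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])
    k = len(kernel[0])
    pad = k // 2

    # Pool across the channel axis to get two H x W descriptors.
    avg_map = [[sum(feature_map[c][i][j] for c in range(channels)) / channels
                for j in range(width)] for i in range(height)]
    max_map = [[max(feature_map[c][i][j] for c in range(channels))
                for j in range(width)] for i in range(height)]
    pooled = [avg_map, max_map]

    # Convolve the two descriptors (zero padding) and squash to (0, 1).
    mask = [[0.0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            acc = bias
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < height and 0 <= x < width:
                            acc += kernel[p][di][dj] * pooled[p][y][x]
            mask[i][j] = sigmoid(acc)

    # Reweight every channel by the same spatial mask.
    attended = [[[feature_map[c][i][j] * mask[i][j] for j in range(width)]
                 for i in range(height)] for c in range(channels)]
    return attended, mask


def classify(attended, weights, bias):
    """Global-average-pool each channel, then one fully connected unit.

    Returns a score in (0, 1) read here as the probability of "abnormal".
    """
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in attended]
    logit = sum(w * v for w, v in zip(weights, pooled)) + bias
    return sigmoid(logit)


if __name__ == "__main__":
    random.seed(0)
    # Stand-in for backbone features: 4 channels of 5 x 5 random values.
    features = [[[random.random() for _ in range(5)] for _ in range(5)]
                for _ in range(4)]
    # Untrained, randomly initialised weights (illustrative only).
    kernel = [[[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
              for _ in range(2)]
    attended, mask = spatial_attention(features, kernel)
    score = classify(attended, [random.uniform(-1, 1) for _ in range(4)], 0.0)
    print("abnormal" if score >= 0.5 else "normal", round(score, 3))
```

In a practical setting the feature map would come from a pretrained backbone and the weights would be learned; the nested-list arithmetic here only makes the data flow explicit.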

Author information

Corresponding author

Correspondence to Rajiv Singh.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Garg, A., Nigam, S., Singh, R., Shastri, A., Singh, M. (2024). Spatial Attention Transformer Based Framework for Anomaly Classification in Image Sequences. In: Choi, B.J., Singh, D., Tiwary, U.S., Chung, WY. (eds) Intelligent Human Computer Interaction. IHCI 2023. Lecture Notes in Computer Science, vol 14532. Springer, Cham. https://doi.org/10.1007/978-3-031-53830-8_6

  • DOI: https://doi.org/10.1007/978-3-031-53830-8_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-53829-2

  • Online ISBN: 978-3-031-53830-8

  • eBook Packages: Computer Science, Computer Science (R0)
