Spatial Attention Transformer Based Framework for Anomaly Classification in Image Sequences

Garg, Aishvarya; Nigam, Swati; Singh, Rajiv; Shastri, Anshuman; Singh, Madhusudan

doi:10.1007/978-3-031-53830-8_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14532)

Included in the following conference series:

International Conference on Intelligent Human Computer Interaction

Abstract

With the increasing number of crimes in crowded and remote areas, video surveillance systems need to recognize abnormal or violent events. Anomaly detection remains a challenging computer vision task because of variations in color, background, and illumination. In recent years, vision transformers and attention modules in deep learning algorithms have shown promising results. This paper presents an attention-based anomaly detection framework that focuses on the extraction of spatial features. The proposed framework is implemented in two steps. The first step extracts spatial features with a Spatial Attention Module (SAM) and a Shifted Window (SWIN) transformer. The second step performs binary classification of abnormal or violent activities on the extracted features via fully connected layers. A performance analysis of pretrained SWIN transformer variants is also presented to guide the choice of model. Four public benchmark datasets, namely CUHK Avenue, University of Minnesota (UMN), AIRTLab, and Industrial Surveillance (IS), are employed for analysis and implementation. The proposed framework outperformed existing state-of-the-art methods by 18% on IS and by 2–20% on Avenue, reaching accuracies of 98.58% and 100%, respectively.
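The two steps described in the abstract can be illustrated with a small, dependency-free sketch. It assumes the SAM follows the common CBAM-style formulation (channel-wise average and max pooling, a convolution over the two pooled maps, and a sigmoid), which the abstract does not confirm. The SWIN backbone is omitted entirely, the fully connected head is reduced to a single logit, and all function names, weights, and inputs are hypothetical placeholders rather than values from the paper.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(feature_map, kernel, bias=0.0):
    """CBAM-style spatial attention (assumed formulation, not the paper's code).

    feature_map: nested lists of shape [C][H][W]
    kernel:      nested lists of shape [2][k][k], one slice per pooled map
    Returns (refined feature map [C][H][W], attention map [H][W]).
    """
    channels = len(feature_map)
    height, width = len(feature_map[0]), len(feature_map[0][0])

    # Step 1a: pool across channels to get two H x W descriptors.
    avg_map = [[sum(feature_map[c][i][j] for c in range(channels)) / channels
                for j in range(width)] for i in range(height)]
    max_map = [[max(feature_map[c][i][j] for c in range(channels))
                for j in range(width)] for i in range(height)]
    pooled = [avg_map, max_map]

    # Step 1b: k x k convolution with zero padding, then a sigmoid.
    k = len(kernel[0])
    pad = k // 2
    attention = []
    for i in range(height):
        row = []
        for j in range(width):
            acc = bias
            for p in range(2):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < height and 0 <= x < width:
                            acc += kernel[p][di][dj] * pooled[p][y][x]
            row.append(sigmoid(acc))
        attention.append(row)

    # Step 1c: reweight every channel by the shared spatial attention map.
    refined = [[[feature_map[c][i][j] * attention[i][j]
                 for j in range(width)] for i in range(height)]
               for c in range(channels)]
    return refined, attention


def classify(refined, weights, bias=0.0):
    """Step 2: global average pooling followed by one linear unit and a sigmoid.

    Returns a score in (0, 1); which class it denotes is a labeling convention.
    """
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
              for ch in refined]
    logit = sum(w * f for w, f in zip(weights, pooled)) + bias
    return sigmoid(logit)


# Toy input: 2 channels, 2 x 2 spatial grid (placeholder values).
feature_map = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[0.0, 2.0], [1.0, 0.0]],
]
# Placeholder 3 x 3 kernel that only looks at the centre pixel of each pooled map.
kernel = [[[0.0] * 3 for _ in range(3)] for _ in range(2)]
kernel[0][1][1] = 0.5
kernel[1][1][1] = 0.5

refined, attention = spatial_attention(feature_map, kernel)
prob = classify(refined, weights=[1.0, -1.0])
print(prob)
```

In a real pipeline the nested lists would be tensors in a deep learning framework, the kernel and head weights would be learned, and the attention-refined features would be combined with a pretrained SWIN backbone before the classifier.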

Author information

Authors and Affiliations

Department of Physical Science, Banasthali Vidyapith, Rajasthan, 304022, India
Aishvarya Garg
Department of Computer Science, Banasthali Vidyapith, Rajasthan, 304022, India
Swati Nigam & Rajiv Singh
Centre for Artificial Intelligence, Banasthali Vidyapith, Rajasthan, 304022, India
Aishvarya Garg, Swati Nigam, Rajiv Singh & Anshuman Shastri
School of Engineering, Science and Management, Oregon Institute of Technology, Oregon, 97601, USA
Madhusudan Singh

Corresponding author

Correspondence to Rajiv Singh.

Editor information

Editors and Affiliations

Soongsil University, Seoul, Korea (Republic of)
Bong Jun Choi
Saint Louis University, St. Louis, MO, USA
Dhananjay Singh
Indian Institute of Information Technology, Allahabad, India
Uma Shanker Tiwary
Pukyong National University, Busan, Korea (Republic of)
Wan-Young Chung

About this paper

Cite this paper

Garg, A., Nigam, S., Singh, R., Shastri, A., Singh, M. (2024). Spatial Attention Transformer Based Framework for Anomaly Classification in Image Sequences. In: Choi, B.J., Singh, D., Tiwary, U.S., Chung, WY. (eds) Intelligent Human Computer Interaction. IHCI 2023. Lecture Notes in Computer Science, vol 14532. Springer, Cham. https://doi.org/10.1007/978-3-031-53830-8_6

DOI: https://doi.org/10.1007/978-3-031-53830-8_6
Published: 29 February 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53829-2
Online ISBN: 978-3-031-53830-8
eBook Packages: Computer Science, Computer Science (R0)