A time sequence location method of long video violence based on improved C3D network

The Journal of Supercomputing

A Correction to this article was published on 12 July 2022, and this article has been updated.

Abstract

This paper studies the retrieval and temporal localization of violence in long videos. To address the low accuracy of violence detection in long videos, a two-stage temporal localization method based on a DC3D network model is proposed. In the preprocessing stage, candidate clips are generated from the video at multiple temporal scales. In the first stage, a C3D network pre-trained on a large labeled dataset scores the candidate clips and filters out meaningless background clips, which improves the overall accuracy. In the second stage, a deconvolution head refines each retained candidate down to the frame level, so that the specific time span of the violence is determined; the resulting DC3D model can thus localize violence to individual frames. Experimental results show that the proposed method can quickly retrieve and localize violent fighting behavior in long surveillance videos. These results can help surveillance personnel quickly retrieve and localize target segments in large volumes of video data.
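To make the two-stage pipeline concrete, below is a minimal PyTorch sketch of how such a system could be assembled. Everything in it is an illustrative assumption rather than the authors' implementation: the helper multiscale_windows, the class names C3DBackbone and DC3D, the layer sizes, the 16-frame clip length, and the 0.5 filtering threshold. Only the overall structure follows the abstract: multi-scale candidate generation, a C3D classifier that filters background clips, and a deconvolutional head that restores per-frame predictions.

```python
# Minimal sketch of the two-stage pipeline from the abstract (PyTorch).
# All names, sizes, and thresholds are illustrative assumptions.
import torch
import torch.nn as nn

def multiscale_windows(num_frames, scales=(16, 32, 64), stride_ratio=0.5):
    """Preprocessing: slide windows of several temporal scales over the long
    video to generate candidate (start, end) clips for stage 1. Clips would
    then be resampled to a fixed length before entering the network."""
    windows = []
    for s in scales:
        step = max(1, int(s * stride_ratio))
        for start in range(0, max(1, num_frames - s + 1), step):
            windows.append((start, start + s))
    return windows

class C3DBackbone(nn.Module):
    """Stage 1: a C3D-style encoder that scores fixed-length clips so that
    meaningless background clips can be filtered out."""
    def __init__(self, num_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=(1, 2, 2)),   # spatial pooling only
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=2),           # halves T, H, W
            nn.Conv3d(128, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d(kernel_size=2),           # halves T, H, W again
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(),
            nn.Linear(256, num_classes),
        )

    def forward(self, clip):                # clip: (B, 3, T, H, W)
        return self.classifier(self.features(clip))

class DC3D(nn.Module):
    """Stage 2: the same encoder followed by temporal deconvolution
    (ConvTranspose3d) that upsamples the downsampled temporal axis back
    to one prediction per input frame."""
    def __init__(self, backbone, num_classes=2):
        super().__init__()
        self.features = backbone.features
        # The encoder halves T twice, so two stride-2 transposed
        # convolutions along the temporal axis restore it exactly.
        self.deconv = nn.Sequential(
            nn.ConvTranspose3d(256, 128, kernel_size=(4, 1, 1),
                               stride=(2, 1, 1), padding=(1, 0, 0)), nn.ReLU(),
            nn.ConvTranspose3d(128, num_classes, kernel_size=(4, 1, 1),
                               stride=(2, 1, 1), padding=(1, 0, 0)),
        )

    def forward(self, clip):                # clip: (B, 3, T, H, W)
        f = self.features(clip)             # temporal axis now T // 4
        f = self.deconv(f)                  # upsampled back to T
        return f.mean(dim=(3, 4))           # per-frame logits: (B, C, T)

if __name__ == "__main__":
    clip = torch.randn(1, 3, 16, 112, 112)  # one 16-frame candidate clip
    backbone = C3DBackbone()
    # Stage 1 (with trained weights): keep only likely-violent candidates.
    if backbone(clip).softmax(-1)[0, 1] > 0.5:
        frame_scores = DC3D(backbone)(clip)  # stage 2: frame-level scores
        print(frame_scores.shape)            # torch.Size([1, 2, 16])
```

In this sketch, stage 1 is trained as a clip classifier and stage 2 reuses its convolutional features, mirroring the pre-train-then-refine order the abstract describes.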


Data availability

The authors declare that all experimental data in this paper are true, valid, and available, and that they were obtained from detailed experiments.



Funding

This paper is supported by the Special Fund for Science and Technology Innovation Strategy of Guangdong Province in 2021 (Special Fund for Climbing Plan) (pdjh2021a0944), the Special Projects in Key Fields of Colleges and Universities in Guangdong Province in 2021 (2021ZDZX1093), the Dongguan Science and Technology of Social Development Program in 2021 (20211800900252), and the Special Fund for the Electronic Information Engineering Technology Specialty Group of the National Double High Program of Dongguan Polytechnic in 2021 (ZXYYD001, ZXF002) and 2022 (ZXB202203, ZXC202201, ZXD202204).

Author information


Contributions

WQ was involved in the conceptualization, methodology, validation, investigation, writing and funding acquisition. TZ contributed to the formal analysis, software, resources and visualization. JL contributed to the software, resources, methodology, validation, writing—review and editing, supervision and funding acquisition. JL helped in the data curation, formal analysis, validation, writing and funding acquisition.

Corresponding author

Correspondence to Ting Zhu.

Ethics declarations

Conflict of interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Informed consent

Any participants (or their guardians if unable to give informed consent, or next of kin, if deceased) who may be identifiable through the manuscript (such as a case report) have been given an opportunity to review the final manuscript and have provided written consent to publish.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: In this article the affiliation details for author Ting Zhu were incorrectly assigned.


About this article


Cite this article

Qu, W., Zhu, T., Liu, J. et al. A time sequence location method of long video violence based on improved C3D network. J Supercomput 78, 19545–19565 (2022). https://doi.org/10.1007/s11227-022-04649-3

