Abstract
Three-dimensional (3D) convolutional neural networks are widely used in video recognition, action recognition, and related tasks because they can extract temporal and spatial features directly. However, owing to their large parameter counts, heavy computational demands, and difficulty of training, 3D convolutional networks are generally shallow: the traditional C3D [17] method uses only an 11-layer VGGNet-style structure, and the traditional Res3D [18] method adopts residual networks of 18 and 34 layers. Experience with two-dimensional convolutional networks suggests that deeper architectures yield higher recognition accuracy. This paper therefore proposes a new method, 3D ResNet-66, which combines a 50-layer 3D residual network with four-layer residual blocks, increasing network depth while effectively reducing the number of parameters; experiments show that the resulting model recognizes video events better. We evaluate our method on shipping event datasets. Compared with the traditional C3D and Res3D methods, our method improves accuracy from 91.48% to 96.33%, reduces the model size from 561 MB to 135 MB, and halves the average processing time.
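The parameter savings behind deep-but-compact residual designs such as ResNet-50 (extended here to 3D) come from bottleneck blocks. A quick calculation illustrates the effect; the channel widths below (256 in/out, squeezed to 64) are assumed for illustration, not the paper's exact configuration:

```python
# Compare parameter counts of a plain 3x3x3 3D convolution with a
# bottleneck residual block (1x1x1 reduce -> 3x3x3 -> 1x1x1 expand),
# the design that lets ResNet-50-style networks grow deep while
# keeping the parameter count down.

def conv3d_params(c_in, c_out, k):
    """Weight count of a 3D convolution with a cubic kernel (bias ignored)."""
    return c_in * c_out * k ** 3

# Assumed widths: 256 channels in/out, bottleneck squeezes to 64.
plain = conv3d_params(256, 256, 3)                 # one wide 3x3x3 conv
bottleneck = (conv3d_params(256, 64, 1)            # 1x1x1 reduce
              + conv3d_params(64, 64, 3)           # cheap 3x3x3 core
              + conv3d_params(64, 256, 1))         # 1x1x1 expand

print(f"plain 3x3x3 conv: {plain:,} params")       # 1,769,472
print(f"bottleneck block: {bottleneck:,} params")  # 143,360
print(f"reduction factor: {plain / bottleneck:.1f}x")
```

With these widths the bottleneck needs roughly a twelfth of the parameters of a single wide 3D convolution, which is what makes a 66-layer 3D network smaller than shallower plain designs.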
References
Carreira J, Zisserman A (2017) Quo vadis, action recognition? A new model and the kinetics dataset. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6299–6308
Feichtenhofer C, Pinz A, Zisserman A (2016) Convolutional two-stream network fusion for video action recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1933–1941
Hara K, Kataoka H, Satoh Y (2018) Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 6546–6555
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 770–778
He K, Zhang X, Ren S, Sun J (2016) Identity mappings in deep residual networks. In: European conference on computer vision. Springer, Cham, pp 630–645
Huang G, Liu Z, Van Der Maaten L, Weinberger KQ (2017) Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 4700–4708
Ioffe S, Szegedy C (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. arXiv preprint arXiv:1502.03167
Ji S, Xu W, Yang M, Yu K (2012) 3D convolutional neural networks for human action recognition. IEEE Trans Pattern Anal Mach Intell 35(1):221–231
Kay W, Carreira J, Simonyan K, Zhang B, Hillier C, Vijayanarasimhan S, Suleyman M (2017) The kinetics human action video dataset. arXiv preprint arXiv:1705.06950
Nair V, Hinton GE (2010) Rectified linear units improve restricted boltzmann machines. In: Proceedings of the 27th international conference on machine learning (ICML-10), pp 807–814
Qiu Z, Yao T, Mei T (2017) Learning spatio-temporal representation with pseudo-3D residual networks. In: Proceedings of the IEEE International Conference on Computer Vision, pp 5533–5541
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556
Simonyan K, Zisserman A (2014) Two-stream convolutional networks for action recognition in videos. In: Advances in neural information processing systems, pp 568–576
Szegedy C, Ioffe S, Vanhoucke V, Alemi AA (2017) Inception-v4, inception-resnet and the impact of residual connections on learning. In: Thirty-First AAAI Conference on Artificial Intelligence
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1–9
Szegedy C, Vanhoucke V, Ioffe S, Shlens J, Wojna Z (2016) Rethinking the inception architecture for computer vision. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 2818–2826
Tran D, Bourdev L, Fergus R, Torresani L, Paluri M (2015) Learning spatiotemporal features with 3d convolutional networks. In: Proceedings of the IEEE international conference on computer vision, pp 4489–4497
Tran D, Ray J, Shou Z, Chang SF, Paluri M (2017) Convnet architecture search for spatiotemporal feature learning. arXiv preprint arXiv:1708.05038
Wang W, Shen J, Lu X, Hoi SC, Ling H (2020) Paying attention to video object pattern understanding. IEEE Transactions on Pattern Analysis and Machine Intelligence
Zagoruyko S, Komodakis N (2016) Wide residual networks. arXiv preprint arXiv:1605.07146
Cite this article
Zhang, H., Rong, J. Enhanced 3D residual network for video event recognition in shipping monitoring. Multimed Tools Appl 80, 3337–3348 (2021). https://doi.org/10.1007/s11042-020-09564-4