Abstract
With the rapid development of multimedia technology, video data has become an important source of big data, and accurately identifying and segmenting the semantic objects in video scenes has become a research hotspot in computer vision. Building on a study of traditional methods for semantic segmentation of video scenes, this paper proposes a semantic marking method for video scenes based on a 3D convolutional neural network. The method uses the 3D convolutional neural network to learn high-level discriminative features of objects in different semantic categories in order to improve the accuracy of semantic segmentation. Experimental results show that the proposed method is comparable to classical methods in segmentation accuracy and can overcome the drag effect caused by erroneous optical flow field relations.
Acknowledgements
This work is supported by the National Key R&D Program of China (No. 2017YFC0803700).
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jinbo, W. (2020). Semantic Marking Method of Video Scene Based on 3D Convolutional Neural Network. In: Huang, C., Chan, YW., Yen, N. (eds) Data Processing Techniques and Applications for Cyber-Physical Systems (DPTA 2019). Advances in Intelligent Systems and Computing, vol 1088. Springer, Singapore. https://doi.org/10.1007/978-981-15-1468-5_238
DOI: https://doi.org/10.1007/978-981-15-1468-5_238
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-1467-8
Online ISBN: 978-981-15-1468-5
eBook Packages: Intelligent Technologies and Robotics (R0)