Semantic Marking Method of Video Scene Based on 3D Convolutional Neural Network

  • Conference paper
  • First Online:
Data Processing Techniques and Applications for Cyber-Physical Systems (DPTA 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1088)

Abstract

With the rapid development of multimedia technology, video data has become an important source of big data. Accurately identifying and segmenting the different semantic objects in a video scene has therefore become a research hotspot in computer vision. Building on a study of traditional semantic segmentation methods for video scenes, this paper proposes a semantic marking method for video scenes based on a 3D convolutional neural network. The method uses the 3D convolutional neural network to learn high-level discriminative features of objects in different semantic categories, thereby improving segmentation accuracy. Experimental results show that the proposed method is comparable to classical methods in segmentation accuracy and can overcome the drag effect caused by erroneous optical flow field relations.
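The chapter itself does not publish source code, and the abstract does not specify an architecture. The sketch below is a minimal, purely illustrative PyTorch model of the general idea described above: a 3D convolutional network that learns spatio-temporal features from a short video clip and predicts a per-pixel semantic label map. The layer sizes, clip length, and number of classes are assumptions for illustration only, not the authors' configuration.

    # Illustrative sketch only: architecture, layer sizes, clip length, and
    # class count are assumptions; the paper does not specify them.
    import torch
    import torch.nn as nn


    class Simple3DSegNet(nn.Module):
        """Toy 3D-CNN semantic labeling network (hypothetical configuration)."""

        def __init__(self, num_classes: int = 5, in_channels: int = 3):
            super().__init__()
            # 3D convolutions filter over (time, height, width) jointly,
            # so the learned features capture motion as well as appearance.
            self.encoder = nn.Sequential(
                nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool3d(kernel_size=(1, 2, 2)),   # downsample space only
                nn.Conv3d(16, 32, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
                nn.MaxPool3d(kernel_size=(1, 2, 2)),
            )
            # 1x1x1 convolution maps the learned features to class scores.
            self.classifier = nn.Conv3d(32, num_classes, kernel_size=1)

        def forward(self, clip: torch.Tensor) -> torch.Tensor:
            # clip: (batch, channels, frames, height, width)
            feats = self.encoder(clip)
            logits = self.classifier(feats)
            # Upsample class scores back to the input resolution so every
            # pixel of every frame receives a semantic label.
            return nn.functional.interpolate(
                logits, size=clip.shape[2:], mode="trilinear", align_corners=False
            )


    if __name__ == "__main__":
        model = Simple3DSegNet(num_classes=5)
        clip = torch.randn(1, 3, 8, 64, 64)    # 8-frame RGB clip, 64x64 pixels
        labels = model(clip).argmax(dim=1)     # per-pixel class indices
        print(labels.shape)                    # torch.Size([1, 8, 64, 64])

In this sketch, pooling is applied only over the spatial dimensions so that the full temporal resolution is kept and each frame still receives a dense label map; this is one plausible design choice, not necessarily the one used in the paper.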



Acknowledgements

This work is supported by the National Key R&D Program of China (No. 2017YFC0803700).

Author information

Corresponding author

Correspondence to Wu Jinbo.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Jinbo, W. (2020). Semantic Marking Method of Video Scene Based on 3D Convolutional Neural Network. In: Huang, C., Chan, YW., Yen, N. (eds) Data Processing Techniques and Applications for Cyber-Physical Systems (DPTA 2019). Advances in Intelligent Systems and Computing, vol 1088. Springer, Singapore. https://doi.org/10.1007/978-981-15-1468-5_238
