Video structural description technology for the new generation video surveillance systems

  • Research Article
  • Frontiers of Computer Science

Abstract

The growing demand for video-based applications underscores the importance of parsing and organizing video content. However, accurate understanding and management of video content at the semantic level remain insufficient. The semantic gap between low-level features and high-level semantics cannot be bridged by manual or semi-automatic methods. This paper proposes a semantic-based model, named video structural description (VSD), for representing and organizing the content of videos. VSD aims to parse video content into text information using spatiotemporal segmentation, feature selection, object recognition, and semantic web technology. The proposed model uses predefined ontologies, comprising concepts and their semantic relations, to represent the content of videos. These ontologies can be used to retrieve and organize videos unambiguously. Beyond the defined ontologies, the semantic relations between videos are also mined, and the video resources are linked and organized according to these relations.
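As a rough illustration of the idea in the abstract, the sketch below represents recognized video content as text triples checked against a small predefined ontology, then links videos that share an entity. It is a minimal sketch under stated assumptions, not the authors' implementation: the concept names, relation names, file names, and the functions `describe` and `link_videos` are all hypothetical, and the "detections" stand in for the output of the segmentation and recognition stages.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical ontology: these concept and relation names are illustrative only.
CONCEPTS = {"Person", "Vehicle", "Location"}
RELATIONS = {"drives", "appears_at"}


def describe(detections):
    """Turn recognized (subject, relation, object) detections into text triples.

    Each subject/object is a (concept, name) pair. Detections that use a
    concept or relation outside the ontology are dropped.
    """
    triples = []
    for (s_concept, s_name), rel, (o_concept, o_name) in detections:
        if s_concept in CONCEPTS and o_concept in CONCEPTS and rel in RELATIONS:
            triples.append((f"{s_concept}:{s_name}", rel, f"{o_concept}:{o_name}"))
    return triples


def link_videos(descriptions):
    """Link two videos whenever their descriptions mention the same entity."""
    entity_to_videos = defaultdict(set)
    for video_id, triples in descriptions.items():
        for subj, _, obj in triples:
            entity_to_videos[subj].add(video_id)
            entity_to_videos[obj].add(video_id)
    links = defaultdict(set)
    for entity, videos in entity_to_videos.items():
        for a, b in combinations(sorted(videos), 2):
            links[(a, b)].add(entity)
    return dict(links)


# Assumed recognition output for two hypothetical camera clips.
descriptions = {
    "cam1.mp4": describe([(("Person", "p1"), "drives", ("Vehicle", "v7"))]),
    "cam2.mp4": describe([
        (("Vehicle", "v7"), "appears_at", ("Location", "gate_3")),
        (("Animal", "a1"), "appears_at", ("Location", "gate_3")),  # not in ontology
    ]),
}
links = link_videos(descriptions)
```

Here the two clips end up linked through the shared entity `Vehicle:v7`, while the detection that uses a concept outside the ontology is discarded. A real system would more likely store such triples in an RDF store and express the ontology in a language such as OWL; plain tuples are used here only to keep the sketch self-contained.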

Author information

Corresponding author

Correspondence to Zheng Xu.

Additional information

Chuanping Hu received his PhD from Tongji University, China. He is the Dean Professor of the Third Research Institute of the Ministry of Public Security, China. He is also the founder of video structural description technology.

Zheng Xu received his PhD from the School of Computing Engineering and Science, Shanghai University, China in 2007 and 2012, respectively. He currently works at the Third Research Institute of the Ministry of Public Security and is a postdoctoral researcher at Tsinghua University, China. His current research interests include intelligent surveillance systems, big data, and crowdsourcing.

Yunhuai Liu is a professor at the Third Research Institute of the Ministry of Public Security, China. He received his PhD from Hong Kong University of Science and Technology (HKUST), China in 2008. His main research interests include wireless sensor networks, pervasive computing, and wireless networks.

Lin Mei received his PhD from Xi’an Jiaotong University, China. He currently works at the Third Research Institute of the Ministry of Public Security, China. He is the Dean Professor of the Department of Internet of Things.

About this article

Cite this article

Hu, C., Xu, Z., Liu, Y. et al. Video structural description technology for the new generation video surveillance systems. Front. Comput. Sci. 9, 980–989 (2015). https://doi.org/10.1007/s11704-015-3482-x

  • DOI: https://doi.org/10.1007/s11704-015-3482-x
