Multisource surveillance video data coding with hierarchical knowledge library

Multimedia Tools and Applications

Abstract

The rapid growth of surveillance video data challenges existing video coding standards. Although a knowledge-based video coding scheme has been proposed to remove the redundancy of moving objects across multiple videos and has achieved a large improvement in coding efficiency, it still has difficulty coping with the complicated visual changes of objects that result from various factors. In this paper, a novel hierarchical knowledge extraction method is proposed. Common knowledge at three coarse-to-fine levels, namely the category level, object level and video level, is extracted from historical data to model the initial appearance, stable changes and temporal changes, respectively, for better object representation and redundancy removal. In addition, we apply the extracted hierarchical knowledge to surveillance video coding tasks and establish a hybrid prediction-based coding framework. On the one hand, hierarchical knowledge is projected onto the image plane to generate references for I frames and achieve better prediction performance. On the other hand, we develop a transform-based prediction for P/B frames that reduces computational complexity while improving coding efficiency. Experimental results demonstrate the effectiveness of the proposed method.
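The three-level organisation described in the abstract can be pictured as a nested lookup structure. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the class names, the `lookup` function, the identifiers (`sedan`, `car_17`, `cam_03`) and the string payloads are all hypothetical placeholders standing in for whatever models the paper actually stores at each level.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class VideoKnowledge:
    # Placeholder for video-level knowledge (temporal changes).
    temporal_changes: str


@dataclass
class ObjectKnowledge:
    # Placeholder for object-level knowledge (stable changes).
    stable_changes: str
    videos: Dict[str, VideoKnowledge] = field(default_factory=dict)


@dataclass
class CategoryKnowledge:
    # Placeholder for category-level knowledge (initial appearance).
    initial_appearance: str
    objects: Dict[str, ObjectKnowledge] = field(default_factory=dict)


def lookup(
    library: Dict[str, CategoryKnowledge],
    category: str,
    object_id: Optional[str] = None,
    video_id: Optional[str] = None,
) -> List[Tuple[str, str]]:
    """Return the available knowledge layers, ordered coarse to fine."""
    layers: List[Tuple[str, str]] = []
    cat = library.get(category)
    if cat is None:
        return layers  # nothing known about this category
    layers.append(("category", cat.initial_appearance))
    obj = cat.objects.get(object_id) if object_id is not None else None
    if obj is None:
        return layers  # fall back to category-level knowledge only
    layers.append(("object", obj.stable_changes))
    vid = obj.videos.get(video_id) if video_id is not None else None
    if vid is not None:
        layers.append(("video", vid.temporal_changes))
    return layers


# Hypothetical library content, for illustration only.
library = {
    "sedan": CategoryKnowledge(
        "generic sedan model",
        objects={
            "car_17": ObjectKnowledge(
                "texture of car_17",
                videos={"cam_03": VideoKnowledge("lighting in cam_03")},
            )
        },
    )
}

layers = lookup(library, "sedan", "car_17", "cam_03")
```

In this sketch a query falls back to the coarser levels when finer knowledge is missing, which is one plausible reading of "coarse-to-fine"; how the paper combines the levels during prediction is not stated in the abstract.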

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grants 61502348, 61671336 and 91738302, by the Natural Science Foundation of Jiangsu Province under Grant BK20180234, by the Open Research Fund of the State Key Laboratory of Information Engineering in Surveying, Mapping and Remote Sensing, Wuhan University, under Grant 17E03, and by the National Key R&D Program of China under Grant 2018YFB1201602.

Author information

Corresponding author

Correspondence to Ruimin Hu.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Chen, Y., Hu, R., Xiao, J. et al. Multisource surveillance video data coding with hierarchical knowledge library. Multimed Tools Appl 78, 14705–14731 (2019). https://doi.org/10.1007/s11042-018-6825-4

  • DOI: https://doi.org/10.1007/s11042-018-6825-4
