Integrating semantic analysis and scalable video coding for efficient content-based adaptation

  • REGULAR PAPER
Multimedia Systems

Abstract

Scalable video coding has become a key technology for deploying systems in which content is adapted to diverse constrained usage environments (such as PDAs, mobile phones and networks) in a simple and efficient way. Content-based adaptation and summarization aim to provide improved adaptation to the user by optimizing the semantic coverage of the adapted or summarized version. This paper proposes integrating content analysis with the scalable video adaptation paradigm. The two must be combined in such a way that the efficiency of scalable adaptation is not compromised. An integrated framework is proposed for semantic video adaptation, together with an adaptive skimming scheme that can use the results of semantic analysis. Both are described using the MPEG-21 DIA tools so that the adaptation is provided within a standard framework. In particular, the case of activity analysis is described to illustrate how semantic analysis is integrated into the framework and how it is used for online content summarization and adaptation. Overall efficiency is achieved by computing activity through compressed domain analysis, with several metrics evaluated as measures of activity.
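
As a rough illustration of activity-driven skimming, the sketch below scores each frame by the mean magnitude of its motion vectors and keeps only the most active units. This is not the method of the paper: the activity metric, the function names `frame_activity` and `select_units`, and the fixed unit budget are all assumptions made for the example, and the motion vectors are assumed to have already been read from the compressed bitstream.

```python
from math import hypot
from typing import List, Sequence, Tuple

# A motion vector as (dx, dy). Hypothetical input: assumed to be parsed
# from the compressed stream by some external tool, not decoded pixels.
MotionVector = Tuple[float, float]


def frame_activity(motion_vectors: Sequence[MotionVector]) -> float:
    """Mean motion-vector magnitude of one frame (0.0 if there are none).

    One possible activity measure among several; chosen here for simplicity.
    """
    if not motion_vectors:
        return 0.0
    total = sum(hypot(dx, dy) for dx, dy in motion_vectors)
    return total / len(motion_vectors)


def select_units(activities: Sequence[float], budget: int) -> List[int]:
    """Indices of the `budget` most active units, in temporal order.

    A unit can be a frame or a group of pictures; dropping whole units
    keeps the adaptation a matter of discarding parts of the stream.
    """
    if budget <= 0:
        return []
    ranked = sorted(range(len(activities)),
                    key=lambda i: activities[i], reverse=True)
    return sorted(ranked[:budget])


if __name__ == "__main__":
    frames = [
        [(0, 0), (0, 0)],   # static frame
        [(3, 4), (3, 4)],   # moderate motion
        [(1, 0)],           # slight motion
        [(6, 8)],           # strong motion
    ]
    activities = [frame_activity(f) for f in frames]
    print(select_units(activities, budget=2))
```

Working on motion vectors rather than decoded frames is what keeps the analysis cheap in this sketch; a real system would choose its metric and selection rule according to the target bitrate and the available temporal scalability levels.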

Author information

Corresponding author

Correspondence to Luis Herranz.

Additional information

Work supported by the Ministerio de Ciencia y Tecnología of the Spanish Government under project TIN2004-07860 (MEDUSA) and by the Comunidad de Madrid under project S-0505-TIC-0223 (PROMULTIDIS).

About this article

Cite this article

Herranz, L. Integrating semantic analysis and scalable video coding for efficient content-based adaptation. Multimedia Systems 13, 103–118 (2007). https://doi.org/10.1007/s00530-007-0090-0

  • DOI: https://doi.org/10.1007/s00530-007-0090-0
