Abstract
Scalable video coding has become a key technology for deploying systems in which content is adapted to diverse, constrained usage environments (such as PDAs, mobile phones and networks) simply and efficiently. Content-based adaptation and summarization aim to provide improved adaptation to the user by seeking to optimize the semantic coverage of the adapted or summarized version. This paper proposes integrating content analysis with the scalable video adaptation paradigm; the two must be combined in such a way that the efficiency of scalable adaptation is not compromised. An integrated framework for semantic video adaptation is proposed, together with an adaptive skimming scheme that can use the results of semantic analysis. Both are described using the MPEG-21 DIA tools so that adaptation is provided within a standard framework. In particular, the case of activity analysis is described to illustrate how semantic analysis is integrated into the framework and how it is used for online content summarization and adaptation. Overall efficiency is achieved by computing activity through compressed-domain analysis, with several metrics evaluated as measures of activity.
Additional information
Work supported by the Ministerio de Ciencia y Tecnología of the Spanish Government under project TIN2004-07860 (MEDUSA) and by the Comunidad de Madrid under project S-0505-TIC-0223 (PROMULTIDIS).
About this article
Cite this article
Herranz, L. Integrating semantic analysis and scalable video coding for efficient content-based adaptation. Multimedia Systems 13, 103–118 (2007). https://doi.org/10.1007/s00530-007-0090-0
DOI: https://doi.org/10.1007/s00530-007-0090-0