Abstract
In this paper, we present a spatio-temporal, event-based approach to video signal analysis and abstraction that employs wavelet transform features. The video signal is modeled as a sequence of independent visual components called events: temporally overlapping, compact functions that describe the temporal evolution of a given set of spatial parameters of the video signal. We use an event-based temporal decomposition technique to resolve this overlap, which is known to be one of the main concerns in video analysis with conventional frame-based schemes. In our method, a set of spatial parameters extracted from the video is expressed, through an optimization process, as a linear combination of the event functions. First, to reduce computational complexity, the video sequence is divided into overlapping groups. Next, Generalized Gaussian Density (GGD) parameters, extracted from the subbands of a 2D wavelet transform, are used as the spatial parameters. Temporal decomposition is then applied to the GGD parameters, arranged as a matrix of per-frame GGD vectors, to compute the event functions and the associated orthogonal GGD parameters. Frames located at event centroids, which are far fewer than the frames in the original video, are taken as keyframe candidates, and the keyframes are selected from them based on a distance criterion in the feature space. Our contribution is that this still-image video abstraction scheme, unlike current methods, does not need shot or cluster boundary detection. Experimental results confirm the efficiency and accuracy of our approach.
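To make the spatial-feature step more concrete, the following minimal sketch estimates the two GGD parameters (scale alpha and shape beta) of a wavelet subband and stacks them into a per-frame GGD vector. The abstract does not say how the parameters are estimated, so moment matching is an assumption made here for illustration; the paper may use a different estimator. The function names `ggd_ratio`, `fit_ggd`, and `frame_ggd_vector` are hypothetical, and the subband coefficients are assumed to come from any 2D wavelet transform, already flattened into lists.

```python
import math


def ggd_ratio(beta):
    """Return (E|x|)^2 / E[x^2] for a zero-mean GGD with shape beta."""
    # Closed form: Gamma(2/b)^2 / (Gamma(1/b) * Gamma(3/b)), in log space.
    return math.exp(
        2.0 * math.lgamma(2.0 / beta)
        - math.lgamma(1.0 / beta)
        - math.lgamma(3.0 / beta)
    )


def fit_ggd(coeffs, lo=0.1, hi=10.0, iters=60):
    """Moment-matching estimate of (alpha, beta) for one subband.

    Assumption: this estimator is illustrative, not taken from the paper.
    """
    n = len(coeffs)
    m1 = sum(abs(c) for c in coeffs) / n   # sample mean of |x|
    m2 = sum(c * c for c in coeffs) / n    # sample mean of x^2
    if m2 == 0.0:
        raise ValueError("constant (all-zero) subband has no GGD fit")
    target = m1 * m1 / m2
    # ggd_ratio increases with beta, so bisection finds the match.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    # Invert E|x| = alpha * Gamma(2/beta) / Gamma(1/beta) for alpha.
    alpha = m1 * math.exp(math.lgamma(1.0 / beta) - math.lgamma(2.0 / beta))
    return alpha, beta


def frame_ggd_vector(subbands):
    """Concatenate (alpha, beta) over all subbands of one frame."""
    vector = []
    for coeffs in subbands:
        vector.extend(fit_ggd(coeffs))
    return vector
```

Under these assumptions, calling `frame_ggd_vector` on every frame of a group and stacking the results column by column would give the matrix of GGD vectors to which temporal decomposition is applied. For reference, Gaussian-distributed coefficients give a shape estimate near 2 and Laplacian-distributed coefficients give one near 1.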
Cite this article
Omidyeganeh, M., Ghaemmaghami, S. & Shirmohammadi, S. Group-based spatio-temporal video analysis and abstraction using wavelet parameters. SIViP 7, 787–798 (2013). https://doi.org/10.1007/s11760-011-0268-y
DOI: https://doi.org/10.1007/s11760-011-0268-y