Abstract
This research develops a comprehensive method for movie abstraction, with applications in fast exploration, indexing, browsing, and skimming of movie content. Most current approaches rely heavily on domain-specific knowledge or models to identify and extract the key scenes of a given movie; however, the extracted segments are often isolated and present a fragmented outline of the original. The proposed method fuses simple audiovisual features and directly measures the “tempos” of a movie, especially the long-term ones. These tempos form a curve that captures the high-level semantics of a movie and indicates the events of interest, referred to as “story intensity.” Tempo also gives the proposed algorithm a natural way to segment a movie into manageable parts. As our experimental results demonstrate, the condensed skimming clips efficiently capture the most interesting and informative parts of the original movie.
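The idea of fusing simple features into a tempo curve and segmenting along it can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the features, the fusion rule, or the segmentation criterion. The sketch assumes two hypothetical per-interval features (`shot_rate` and `audio_energy`), an equal-weight linear fusion, a moving average standing in for the long-term tempo, and cuts placed at local minima of the curve.

```python
def moving_average(values, window):
    """Centered moving average; the window shrinks at the sequence edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def normalize(values):
    """Scale a series to [0, 1]; a constant series maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def tempo_curve(shot_rate, audio_energy, w_visual=0.5, window=5):
    """Fuse two features into one smoothed curve.

    The weight `w_visual` and the smoothing `window` are illustrative
    assumptions, not values taken from the paper.
    """
    visual = normalize(shot_rate)
    audio = normalize(audio_energy)
    fused = [w_visual * v + (1 - w_visual) * a for v, a in zip(visual, audio)]
    return moving_average(fused, window)


def segment_at_minima(curve):
    """Split the index range [0, len(curve)) at strict local minima."""
    cuts = [i for i in range(1, len(curve) - 1)
            if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]]
    bounds = [0] + cuts + [len(curve)]
    return list(zip(bounds[:-1], bounds[1:]))


# Example with made-up feature values, one per time interval.
shot_rate = [1, 3, 5, 3, 1, 3, 5, 3, 1]
audio_energy = [2, 4, 6, 4, 2, 4, 6, 4, 2]
curve = tempo_curve(shot_rate, audio_energy, window=3)
segments = segment_at_minima(curve)
```

Each returned pair is a half-open interval of indices; a larger `window` emphasizes longer-term tempo and generally yields fewer, longer segments.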
Acknowledgements
The authors would like to thank the National Science Council of the Republic of China for financially supporting this research under Contracts No. NSC95-2218-E-259-047 and NSC96-2628-E-110-020-MY2.
Cite this article
Yeh, CH., Kuo, CH. & Liou, RW. Movie story intensity representation through audiovisual tempo analysis. Multimed Tools Appl 44, 205–228 (2009). https://doi.org/10.1007/s11042-009-0278-8
DOI: https://doi.org/10.1007/s11042-009-0278-8