Movie story intensity representation through audiovisual tempo analysis

Multimedia Tools and Applications

Abstract

A comprehensive method for movie abstraction is developed in this research for applications in fast movie content exploration, indexing, browsing, and skimming. Most current approaches rely heavily on specific domain knowledge or models to identify and extract the determining scenes of a given movie; however, the extracted segments are often isolated, presenting a fragmented outline of the original. Our proposed method fuses simple audiovisual features and measures the “tempos” of a movie directly, especially long-term tempos. These tempos form a curve that captures the high-level semantics of a movie and indicates the events of interest, termed “story intensity.” Through tempo, the proposed algorithm provides a natural way to segment a movie into manageable parts. As our experimental results demonstrate, the condensed skimming clips efficiently extract semantic content that contains the most interesting and informative parts of the original movie.
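
The abstract does not specify which features are fused or how the curve is segmented, so the following is only an illustrative sketch of the general idea, not the authors' algorithm. It assumes two hypothetical per-window features (`shot_change_rate` and `audio_energy`), fuses them by a weighted sum of z-scores, smooths the result with a moving average to favor long-term tempo, and cuts the curve at local minima. The function names, the weight, and the window size are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def zscore(values):
    """Normalize a feature sequence to zero mean and unit variance."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def moving_average(values, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out


def tempo_curve(shot_change_rate, audio_energy, weight=0.5, window=5):
    """Fuse two per-window features into one smoothed tempo curve.

    Assumption: a weighted sum of z-scores stands in for the paper's
    (unspecified) fusion rule.
    """
    if len(shot_change_rate) != len(audio_energy):
        raise ValueError("feature sequences must have the same length")
    visual = zscore(shot_change_rate)
    audio = zscore(audio_energy)
    fused = [weight * v + (1.0 - weight) * a for v, a in zip(visual, audio)]
    return moving_average(fused, window)


def segment_boundaries(curve):
    """Return indices of strict local minima, used here as segment cuts."""
    return [i for i in range(1, len(curve) - 1)
            if curve[i] < curve[i - 1] and curve[i] < curve[i + 1]]


def peak_segment(curve, boundaries):
    """Return (start, end) of the segment with the highest mean tempo."""
    edges = [0] + list(boundaries) + [len(curve)]
    spans = [(a, b) for a, b in zip(edges, edges[1:]) if b > a]
    return max(spans, key=lambda s: mean(curve[s[0]:s[1]]))
```

In this sketch, a larger `window` suppresses short bursts and emphasizes long-term tempo, and the segment returned by `peak_segment` would be a candidate for inclusion in a skimming clip.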

Acknowledgements

The authors would like to thank the National Science Council of the Republic of China for financially supporting this research under Contracts No. NSC95-2218-E-259-047 and NSC96-2628-E-110-020-MY2.

Author information

Corresponding author

Correspondence to Chia-Hung Yeh.

About this article

Cite this article

Yeh, CH., Kuo, CH. & Liou, RW. Movie story intensity representation through audiovisual tempo analysis. Multimed Tools Appl 44, 205–228 (2009). https://doi.org/10.1007/s11042-009-0278-8

  • DOI: https://doi.org/10.1007/s11042-009-0278-8
