On-line video abstract generation of multimedia news

Valdés, Víctor; Martínez, José M.

doi:10.1007/s11042-011-0774-5

Published: 23 March 2011

Volume 59, pages 795–832, (2012)

Multimedia Tools and Applications

Víctor Valdés¹ & José M. Martínez¹

Abstract

The amount of video content available today makes video abstraction techniques a necessary tool for easing access to already huge and ever-growing video databases. Nevertheless, many existing video abstraction approaches have high computational requirements, which complicates the integration and exploitation of current technologies in real environments. This paper presents a novel method for news bulletin abstraction that combines on-line story segmentation, on-line video skimming and layout composition techniques. The resulting algorithm is an efficient, automatic and on-line news abstraction method that takes advantage of the specific characteristics of news bulletins to obtain representative news abstracts.

Notes

http://www.mesh-ip.eu
http://sourceforge.net/projects/opencvlibrary/
Hardware platform: Intel Core 2 Duo @2.53 GHz with 4 GB of RAM.
Due to the lack of knowledge about the Chinese language.

Acknowledgements

The authors thank Wilfried Runde and Jochen Spangenberg from Deutsche Welle for their collaboration in making the results of this work available on the web site. All the news bulletins used as the content set for this work are © Deutsche Welle and/or respective copyright holders. (Material has been kindly provided for research/academic purposes only. Not for commercial use. No duplication, copying, re-use of any kind allowed.) Work supported by the European Commission (IST-FP6-027685, Mesh), the Spanish Government (TIN2007-65400, SemanticVideo), the Comunidad de Madrid (S-0505/TIC-0223 (ProMultiDis) CM), the Consejería de Educación of the Comunidad de Madrid and the European Social Fund.

Author information

Authors and Affiliations

Escuela Politécnica Superior, Universidad Autónoma de Madrid, C/Francisco Tomás y Valiente 11, 28049, Madrid, Spain
Víctor Valdés & José M. Martínez

Corresponding author

Correspondence to Víctor Valdés.

About this article

Cite this article

Valdés, V., Martínez, J.M. On-line video abstract generation of multimedia news. Multimed Tools Appl 59, 795–832 (2012). https://doi.org/10.1007/s11042-011-0774-5

Published: 23 March 2011
Issue Date: August 2012
DOI: https://doi.org/10.1007/s11042-011-0774-5
