Segmentation of news videos based on audio-video information

De Santo, Massimo; Percannella, Gennaro; Sansone, Carlo; Vento, Mario

doi:10.1007/s10044-006-0055-5

Theoretical Advances
Published: 28 November 2006

Volume 10, pages 135–145 (2007)

Pattern Analysis and Applications

Massimo De Santo¹,
Gennaro Percannella¹,
Carlo Sansone² &
…
Mario Vento¹

Abstract

In this paper, we propose an innovative architecture that segments a news video into so-called “stories” using both its video and audio information. Segmenting news into stories is a key issue for the efficient management of news-based digital libraries. Although the relevance of this research problem is widely recognized in the scientific community, few established solutions exist. In our approach, segmentation is performed in two steps. First, shots are classified by combining three different anchor shot detection algorithms that use video information only. Then, the shot classification is refined by a novel anchor shot detection method based on features extracted from the audio track. Tests on a large database confirm that the proposed system outperforms each individual video-based method as well as their combination.
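
The two-step scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how the three video-based detectors are combined, how the audio-based detector corrects their output, or how story boundaries follow from anchor shots. The sketch assumes that a unanimous video decision is kept, that the audio-based decision is used whenever the video detectors disagree, and that a story begins at an anchor shot that follows a non-anchor shot. The names `Shot`, `video_votes`, `audio_is_anchor`, `classify_shot`, and `segment_stories` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Shot:
    index: int
    # Hypothetical outputs of three video-based anchor shot detectors.
    video_votes: Sequence[bool]
    # Hypothetical output of an audio-based anchor shot detector.
    audio_is_anchor: bool


def classify_shot(shot: Shot) -> bool:
    """Return True if the shot is labeled as an anchor shot.

    Assumption (not from the paper): a unanimous video decision is kept;
    when the video detectors disagree, the audio decision is used.
    """
    votes = list(shot.video_votes)
    if all(votes) or not any(votes):
        return votes[0]
    return shot.audio_is_anchor


def segment_stories(shots: Sequence[Shot]) -> List[List[int]]:
    """Group shot indices into stories.

    Assumption (not from the paper): a new story starts at the first shot
    and at every anchor shot that follows a non-anchor shot.
    """
    stories: List[List[int]] = []
    previous_was_anchor = False
    for shot in shots:
        is_anchor = classify_shot(shot)
        if not stories or (is_anchor and not previous_was_anchor):
            stories.append([shot.index])
        else:
            stories[-1].append(shot.index)
        previous_was_anchor = is_anchor
    return stories


if __name__ == "__main__":
    demo = [
        Shot(0, (True, True, True), True),
        Shot(1, (False, False, False), False),
        Shot(2, (True, False, False), True),
    ]
    print(segment_stories(demo))
```

In this sketch the audio track matters only for shots on which the video detectors disagree, which is one simple way to let a second modality refine an initial classification; the combination rule actually used in the paper may differ.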

Author information

Authors and Affiliations

Dipartimento di Ingegneria dell’Informazione e di Ingegneria Elettrica, Università di Salerno, Via Ponte don Melillo 1, 84084, Fisciano, SA, Italy
Massimo De Santo, Gennaro Percannella & Mario Vento
Dipartimento di Informatica e Sistemistica, Università di Napoli “Federico II”, Via Claudio, 21, 80125, Napoli, Italy
Carlo Sansone

Corresponding author

Correspondence to Gennaro Percannella.

About this article

Cite this article

De Santo, M., Percannella, G., Sansone, C. et al. Segmentation of news videos based on audio-video information. Pattern Anal Applic 10, 135–145 (2007). https://doi.org/10.1007/s10044-006-0055-5

Received: 15 December 2004
Accepted: 14 September 2006
Published: 28 November 2006
Issue Date: May 2007
DOI: https://doi.org/10.1007/s10044-006-0055-5
