
Advanced news video parsing via visual characteristics of anchorperson scenes

Telecommunication Systems

Abstract

In this paper, we present an advanced news video parsing system that exploits the visual characteristics of anchorperson scenes, with the aim of providing personalized news services over the Internet or mobile platforms. Because anchorperson shots serve as the root shots from which a news video is constructed, the system first performs anchorperson detection, which divides the news into several segments. By combining multiple features with post-processing, our anchorperson detection method can be applied efficiently even to news video whose anchorperson scenes are challenging and complicated. The segments produced by anchorperson detection are usually regarded as news stories. However, observations on our database show that this assumption does not always hold, because of interview scenes, in which the interviewer (the anchorperson) and the interviewee appear alternately. We therefore apply a technique called interview clustering, based on face similarity, to merge these interview segments. Another novel aspect of our system is entity summarization of interview scenes, which is applied as the final step. The effectiveness and robustness of the proposed system are demonstrated by an evaluation on 19 hours of news programs from 6 different TV channels.
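
The two-stage idea described above (split the program at anchorperson shots, then merge the segments that belong to one interview) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a single face descriptor per segment, the cosine similarity measure, and the 0.8 threshold are all assumptions made for illustration only.

```python
import math
from typing import List, Optional, Sequence, Tuple

Segment = Tuple[int, int]  # (first_shot, last_shot), both inclusive


def split_by_anchor(num_shots: int, anchor_shots: Sequence[int]) -> List[Segment]:
    """Start a new segment at every detected anchorperson shot (hypothetical helper)."""
    starts = sorted(set(anchor_shots))
    if not starts or starts[0] != 0:
        # Shots before the first anchor shot form their own segment.
        starts = [0] + starts
    ends = [s - 1 for s in starts[1:]] + [num_shots - 1]
    return list(zip(starts, ends))


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two face descriptors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def merge_interviews(
    segments: Sequence[Segment],
    guest_faces: Sequence[Optional[Sequence[float]]],
    threshold: float = 0.8,  # assumed value, not taken from the paper
) -> List[Segment]:
    """Merge consecutive segments whose non-anchor faces are similar.

    guest_faces[i] is an assumed descriptor of the main non-anchor face in
    segment i, or None when the segment contains no such face.
    """
    merged: List[Segment] = []
    last_face: Optional[Sequence[float]] = None
    for seg, face in zip(segments, guest_faces):
        same_guest = (
            bool(merged)
            and face is not None
            and last_face is not None
            and cosine(face, last_face) >= threshold
        )
        if same_guest:
            # Extend the previous segment to cover this one.
            merged[-1] = (merged[-1][0], seg[1])
        else:
            merged.append(seg)
        last_face = face
    return merged


segments = split_by_anchor(10, [0, 3, 5, 7])
faces = [None, [1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
stories = merge_interviews(segments, faces)
```

In this toy input the second and third segments show nearly the same guest face, so they are merged into one interview segment, while the first and last segments stay separate.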

Acknowledgements

This work was supported by both the Invenio project launched by France Telecom R&D (Orange Labs) and the Key Project of the National Natural Science Foundation of China (90920001).

Author information

Corresponding author

Correspondence to Yuan Dong.

About this article

Cite this article

Dong, Y., Qin, G., Xiao, G. et al. Advanced news video parsing via visual characteristics of anchorperson scenes. Telecommun Syst 54, 247–263 (2013). https://doi.org/10.1007/s11235-013-9731-0

  • DOI: https://doi.org/10.1007/s11235-013-9731-0