Advanced news video parsing via visual characteristics of anchorperson scenes

Dong, Yuan; Qin, Gang; Xiao, Guorui; Lian, Shiguo; Chang, Xiaofu

doi:10.1007/s11235-013-9731-0

Published: 12 July 2013

Telecommunication Systems, Volume 54, pages 247–263 (2013)

Abstract

In this paper, we present an advanced news video parsing system that exploits the visual characteristics of anchorperson scenes and aims to provide personalized news services over the Internet or mobile platforms. Because anchorperson shots serve as the root shots from which news video is constructed, the system first performs anchorperson detection, which divides the news into several segments. By combining multiple features with post-processing, our anchorperson detection method can be applied efficiently even to news video whose anchorperson scenes are highly challenging and complicated. The segments produced by anchorperson detection are usually regarded as news stories. However, observations on our database show that this assumption does not hold when interview scenes are present. In these scenes, the interviewer (the anchorperson) and the interviewee appear alternately. We therefore apply a technique called interview clustering, based on face similarity, to merge these interview segments. Another novel aspect of our system is entity summarization of interview scenes, which forms its final stage. The effectiveness and robustness of the proposed system are demonstrated by an evaluation on 19 hours of news programs from 6 different TV channels.
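The interview clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the face descriptor, the similarity measure, or the merging rule, so the `Segment` type, the cosine similarity, and the `threshold` value below are all assumptions chosen for illustration. The sketch merges adjacent segments whose dominant non-anchor faces look alike, on the assumption that they belong to one interview.

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Optional, Sequence


@dataclass
class Segment:
    """One segment produced by anchorperson detection (hypothetical type)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    # Assumed descriptor of the dominant non-anchor face, or None if no face was found.
    face: Optional[Sequence[float]]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two descriptors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_interview_segments(segments: List[Segment],
                             threshold: float = 0.9) -> List[Segment]:
    """Merge adjacent segments whose faces are similar (assumed merging rule)."""
    merged: List[Segment] = []
    for seg in segments:
        prev = merged[-1] if merged else None
        same_person = (
            prev is not None
            and prev.face is not None
            and seg.face is not None
            and cosine_similarity(prev.face, seg.face) >= threshold
        )
        if same_person:
            # Extend the previous segment so it covers the current one.
            merged[-1] = Segment(prev.start, seg.end, prev.face)
        else:
            merged.append(seg)
    return merged


# Toy input: the first two segments show the same (hypothetical) interviewee.
segments = [
    Segment(0.0, 10.0, [1.0, 0.0]),
    Segment(10.0, 20.0, [0.99, 0.05]),
    Segment(20.0, 30.0, [0.0, 1.0]),
    Segment(30.0, 40.0, None),
]
stories = merge_interview_segments(segments)
```

In this toy input the first two segments are merged into one story spanning 0 to 20 seconds, while the remaining two stay separate. A real system would need a face descriptor robust to pose and lighting, and a threshold tuned on labelled data.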

Acknowledgements

This work was supported by both the Invenio project launched by France Telecom R&D (Orange Labs) and the Key Project of the National Natural Science Foundation of China (90920001).

Author information

Authors and Affiliations

Beijing University of Posts and Telecommunications, Beijing, China
Yuan Dong, Gang Qin & Guorui Xiao
Corporate Research, Huawei Tech Co., Shenzhen, China
Shiguo Lian
France Telecom R&D, Beijing, China
Xiaofu Chang

Corresponding author

Correspondence to Yuan Dong.

About this article

Cite this article

Dong, Y., Qin, G., Xiao, G. et al. Advanced news video parsing via visual characteristics of anchorperson scenes. Telecommun Syst 54, 247–263 (2013). https://doi.org/10.1007/s11235-013-9731-0

Issue Date: November 2013
DOI: https://doi.org/10.1007/s11235-013-9731-0
