An Effective News Anchorperson Shot Detection Method Based on Adaptive Audio/Visual Model Generation

Kim, Sang-Kyun; Hwang, Doo Sun; Kim, Ji-Yeun; Seo, Yang-Seock

doi:10.1007/11526346_31

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3568)

Included in the following conference series:

International Conference on Image and Video Retrieval

Abstract

This paper proposes a multi-modal method that improves anchorperson shot detection for news story segmentation. Anchorperson voice information is used to verify the anchorperson shot candidates extracted from visual information. The algorithm first extracts anchorperson voice shot candidates using time and silence conditions. Anchorperson templates are then generated from the face and clothing information in the extracted anchorperson voice shots. Anchorperson voice models are created after the anchorperson voice shots containing two or more voices are segregated. These voice models verify the anchorperson shot candidates obtained from visual information. The method is tested on 720 minutes of news programs, and experimental results are presented.
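
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Shot` fields, the threshold values, the averaged feature vector standing in for a voice model, and the Euclidean distance test are all assumptions made for illustration, since the abstract does not specify them. The segregation of shots containing two or more voices is omitted, and visual template matching is represented only by a precomputed flag.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Shot:
    start: float            # shot start time in seconds
    end: float              # shot end time in seconds
    silence_before: float   # seconds of silence preceding the shot (assumed precomputed)
    voice_feature: tuple    # hypothetical fixed-length voice feature vector
    visual_match: bool      # hypothetical outcome of face/clothing template matching


def voice_shot_candidates(shots, min_duration=10.0, min_silence=0.3):
    """Select voice shot candidates by a time and a silence condition.

    Both thresholds are illustrative assumptions, not values from the paper.
    """
    return [
        s for s in shots
        if (s.end - s.start) >= min_duration and s.silence_before >= min_silence
    ]


def build_voice_model(candidates):
    """Average the candidate feature vectors (a stand-in for a real voice model)."""
    n = len(candidates)
    return tuple(sum(dim) / n for dim in zip(*(s.voice_feature for s in candidates)))


def verify(shots, model, max_distance=1.0):
    """Keep visual candidates whose voice feature lies close to the voice model."""
    return [
        s for s in shots
        if s.visual_match and dist(s.voice_feature, model) <= max_distance
    ]
```

In this sketch a shot that matches the visual template but carries a different voice (for example, an interviewee in a similar studio framing) is rejected at the `verify` step, which is the role the abstract assigns to the voice model.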

Author information

Authors and Affiliations

Computing Lab., Digital Research Center, Samsung A.I.T., San 14-1 Nongseo-ri Kiheung-eup, Yongin-si, 449-712, Republic of Korea
Sang-Kyun Kim, Doo Sun Hwang, Ji-Yeun Kim & Yang-Seock Seo

Editor information

Editors and Affiliations

Dept. of Computer Science, National University of Singapore, Computing 1, 117590, Singapore
Wee-Kheng Leow
LIACS Media Lab, Leiden University
Michael S. Lew & Erwin M. Bakker
National University of Singapore, 3 Science Dr, 117543, Singapore
Tat-Seng Chua
Microsoft Research Asia, 4F, Sigma Center, No.49, Zhichun Road, 100080, Beijing, P.R.China
Wei-Ying Ma
School of Computing, National University of Singapore, 3 Science Drive 2, 117543, Singapore
Lekha Chaisorn

About this paper

Cite this paper

Kim, SK., Hwang, D.S., Kim, JY., Seo, YS. (2005). An Effective News Anchorperson Shot Detection Method Based on Adaptive Audio/Visual Model Generation. In: Leow, WK., Lew, M.S., Chua, TS., Ma, WY., Chaisorn, L., Bakker, E.M. (eds) Image and Video Retrieval. CIVR 2005. Lecture Notes in Computer Science, vol 3568. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11526346_31

DOI: https://doi.org/10.1007/11526346_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-27858-0
Online ISBN: 978-3-540-31678-7
eBook Packages: Computer Science, Computer Science (R0)
