Role-based identity recognition for TV broadcasts

Schwarze, Tobias; Riegel, Thomas; Han, Seunghan; Hutter, Andreas; Nowak, Stefanie; Ebel, Sascha; Petersohn, Christian; Ndjiki-Nya, Patrick

doi:10.1007/s11042-011-0834-x

Published: 20 July 2011

Multimedia Tools and Applications, volume 63, pages 501–520 (2013)

Abstract

Semantic queries that involve image understanding require the exploitation of multiple cues, namely the (inter-)relations between objects and events across multiple images, the situational context, and the application context. A prominent example of such queries is the identification of individuals in video sequences. Straightforward face recognition approaches require a model of the persons in question and tend to fail in ill-conditioned environments. An alternative approach is therefore to use the contextual conditions of observations to determine the role a person plays in the current context. Because roles, persons and their identities are strongly related, knowing one often allows the others to be inferred. This paper presents a system that implements this approach. First, robust face detection localizes the actors in the video. Clustering similar face instances then yields the relative frequency of each person’s appearance within a sequence. In combination with a coarse textual annotation created manually by the broadcast station’s archivist, the roles and consequently the identities can be assigned and labeled in the video. Starting with unambiguous assignments and cascading from there, most of the persons can be identified and labeled successfully. The feasibility and performance of role-based person identification are demonstrated on several episodes of a popular German TV show, which consists of various elements such as interview scenes, games and musical show acts.
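
The cascading idea described in the abstract can be illustrated with a minimal sketch. It is not the system presented in the paper. It assumes that face clustering has already produced one integer cluster id per detected face, that the archivist's annotation lists the persons present in each scene, that each person corresponds to exactly one cluster, and, purely for illustration, that the most frequent cluster belongs to the host. All function names, scene names and person names below are hypothetical.

```python
from collections import Counter


def cluster_frequencies(face_cluster_ids):
    """Relative frequency of each face cluster within one sequence."""
    counts = Counter(face_cluster_ids)
    total = sum(counts.values())
    return {cid: n / total for cid, n in counts.items()}


def cascade_assign(scene_clusters, scene_people, seed=None):
    """Label clusters by repeatedly resolving unambiguous scenes.

    scene_clusters: {scene: set of cluster ids seen in that scene}
    scene_people:   {scene: set of names annotated for that scene}
    seed:           optional {cluster id: name} known in advance
    """
    labels = dict(seed or {})
    changed = True
    while changed:
        changed = False
        for scene, clusters in scene_clusters.items():
            # Clusters and names in this scene that are still unresolved.
            open_clusters = clusters - set(labels)
            open_names = scene_people[scene] - set(labels.values())
            # Exactly one of each left: the assignment is unambiguous.
            if len(open_clusters) == 1 and len(open_names) == 1:
                labels[next(iter(open_clusters))] = next(iter(open_names))
                changed = True
    return labels


# Hypothetical cluster ids for the faces detected over a whole show.
freqs = cluster_frequencies([0, 0, 0, 1, 0, 2, 1, 0, 3])
# Illustrative assumption: the most frequent face is the host.
host_cluster = max(freqs, key=freqs.get)

scene_clusters = {"interview": {0, 1}, "game": {0, 1, 2}, "music": {3}}
scene_people = {
    "interview": {"Host", "Guest A"},
    "game": {"Host", "Guest A", "Guest B"},
    "music": {"Singer"},
}

labels = cascade_assign(scene_clusters, scene_people, seed={host_cluster: "Host"})
print(labels)
```

Once the host is fixed, the interview scene has only one unresolved cluster and one unresolved name, which in turn resolves the game scene. Scenes that stay ambiguous are simply left unlabeled in this sketch.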

Acknowledgments

This work has been supported by the THESEUS Program, which is funded by the German Federal Ministry of Economics and Technology. In particular, we thank our THESEUS project partner Institut für Rundfunktechnik for providing the TV program data and permission to use them for scientific purposes.

Author information

Authors and Affiliations

Siemens AG, Corporate Technology, Otto-Hahn-Ring 6, 80200, Munich, Germany
Tobias Schwarze, Thomas Riegel, Seunghan Han & Andreas Hutter
Fraunhofer Institute for Digital Media Technology, Ehrenbergstrasse 31, 98693, Ilmenau, Germany
Stefanie Nowak
Fraunhofer Institute for Telecommunications, Einsteinufer 37, 10587, Berlin, Germany
Sascha Ebel, Christian Petersohn & Patrick Ndjiki-Nya

Corresponding author

Correspondence to Thomas Riegel.

Additional information

Part of the content of this paper was presented at the 3rd International Workshop on Automated Information Extraction in Media Production (AIEMPro’10), Florence, 25–29 October 2010.

About this article

Cite this article

Schwarze, T., Riegel, T., Han, S. et al. Role-based identity recognition for TV broadcasts. Multimed Tools Appl 63, 501–520 (2013). https://doi.org/10.1007/s11042-011-0834-x

Issue Date: March 2013