Role-based identity recognition for TV broadcasts

Multimedia Tools and Applications

Abstract

Semantic queries that involve image understanding require the exploitation of multiple cues, namely the (inter-)relations between objects and events across multiple images, the situational context, and the application context. A prominent example of such queries is the identification of individuals in video sequences. Straightforward face recognition approaches require a model of the persons in question and tend to fail in ill-conditioned environments. An alternative approach is therefore to use the contextual conditions of the observations to determine the role a person plays in the current context. Because roles, persons, and their identities are strongly related, knowing one often allows the others to be inferred. This paper presents a system that implements this approach. First, robust face detection localizes the actors in the video. Clustering similar face instances then yields the relative frequency with which each person appears within a sequence. In combination with a coarse textual annotation created manually by the broadcast station’s archivist, the roles, and consequently the identities, can be assigned and labeled in the video. Starting with unambiguous assignments and cascading from there, most of the persons can be identified and labeled successfully. The feasibility and performance of role-based person identification are demonstrated on several episodes of a popular German TV show, which consists of various elements such as interview scenes, games, and musical show acts.
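The cascading idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the scene representation (a set of face-cluster ids paired with the set of persons the annotation lists for that scene), and the simple elimination rule are all assumptions made for illustration. Face detection and clustering are assumed to have already produced the cluster ids.

```python
from collections import Counter


def relative_frequencies(cluster_labels):
    """Share of detected face instances that fall into each cluster.

    cluster_labels: one (hypothetical) cluster id per detected face.
    """
    counts = Counter(cluster_labels)
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()}


def cascade_assign(scenes):
    """Assign names to face clusters by repeated elimination.

    scenes: list of (clusters, names) pairs of sets. A scene is
    unambiguous when exactly one cluster and one name remain open;
    resolving it may make other scenes unambiguous in a later pass.
    This elimination rule is an illustrative assumption.
    """
    assigned = {}
    progress = True
    while progress:
        progress = False
        for clusters, names in scenes:
            open_clusters = clusters - set(assigned)
            open_names = names - set(assigned.values())
            if len(open_clusters) == 1 and len(open_names) == 1:
                assigned[open_clusters.pop()] = open_names.pop()
                progress = True
    return assigned


# Hypothetical annotation: the host appears in every scene,
# guests join one at a time.
scenes = [
    ({0, 1, 2}, {"host", "guest A", "guest B"}),
    ({0, 1}, {"host", "guest A"}),
    ({0}, {"host"}),
]
labels = cascade_assign(scenes)
```

In this toy input only the last scene is unambiguous at first. Once cluster 0 is labeled as the host, the second scene has a single open cluster and name, and the first scene resolves in the following pass. The relative frequencies could serve as an additional cue, for example by expecting the most frequent cluster to belong to the host, but how the paper combines frequency and annotation is not specified in the abstract.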

Acknowledgments

This work has been supported by the THESEUS Program, which is funded by the German Federal Ministry of Economics and Technology. In particular, we thank our THESEUS project partner Institut für Rundfunktechnik for providing the TV program data and permission to use them for scientific purposes.

Author information

Corresponding author

Correspondence to Thomas Riegel.

Additional information

Part of the content of this paper was presented at the 3rd International Workshop on Automated Information Extraction in Media Production (AIEMPro’10), Florence, 25–29 October 2010.

About this article

Cite this article

Schwarze, T., Riegel, T., Han, S. et al. Role-based identity recognition for TV broadcasts. Multimed Tools Appl 63, 501–520 (2013). https://doi.org/10.1007/s11042-011-0834-x

  • DOI: https://doi.org/10.1007/s11042-011-0834-x
