DOI: 10.1145/1877850.1877859
research-article

Role-based identity recognition for telecasts

Published: 29 October 2010

Abstract

Semantic queries that involve image understanding require exploiting multiple clues: the (inter-)relations between objects and events across multiple images, the situational context, and the application context. A prominent example of such a query is the identification of individuals in video sequences. Straightforward face recognition approaches require a model of each person in question and tend to fail in ill-conditioned environments. An alternative approach is therefore to use the contextual conditions of an observation to determine the role a person plays in the current context. Because roles, persons, and their identities are strongly related, knowing one often allows the others to be inferred.
This paper presents a system that implements this approach. First, robust face detection localizes the actors in the video. Clustering similar face instances then yields the relative frequency with which each face appears within a sequence. Combined with a coarse textual annotation created manually by the broadcast station's archivist, these frequencies allow roles, and consequently identities, to be assigned and labeled in the video. Starting with unambiguous assignments and cascading appropriately, most of the persons can be identified and labeled successfully. The feasibility and performance of role-based person identification are demonstrated on the basis of several programs of a popular German TV show, which consists of various elements such as interview scenes, games, and musical show acts.
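
The cascading step can be illustrated with a small sketch. The abstract does not give the algorithm in detail, so everything below is an assumption made for illustration: the function name `assign_identities`, the input layout (one pair of face-cluster counts and annotated names per segment), and the heuristic that the most frequently seen face cluster belongs to the host. It is not the authors' implementation; it only shows how unambiguous assignments can unlock further ones.

```python
from collections import Counter


def assign_identities(segments, host_name=None):
    """Label face clusters by cascading from unambiguous assignments.

    segments: list of (cluster_counts, names) pairs, where cluster_counts
    maps a face-cluster id to its number of appearances in the segment and
    names is the set of persons listed in the archivist's annotation.
    Both the layout and the host heuristic are illustrative assumptions.
    """
    labels = {}  # face-cluster id -> person name

    # Assumed heuristic: the cluster seen most often overall is the host.
    if host_name is not None:
        totals = Counter()
        for counts, _ in segments:
            totals.update(counts)
        if totals:
            host_cluster, _ = totals.most_common(1)[0]
            labels[host_cluster] = host_name

    # Cascade: a segment with exactly one unlabeled cluster and one
    # unassigned name is unambiguous; resolving it may unlock others.
    changed = True
    while changed:
        changed = False
        for counts, names in segments:
            open_clusters = [c for c in counts if c not in labels]
            open_names = names - set(labels.values())
            if len(open_clusters) == 1 and len(open_names) == 1:
                labels[open_clusters[0]] = next(iter(open_names))
                changed = True
    return labels


# Hypothetical show with three annotated segments.
segments = [
    ({0: 30, 2: 4, 3: 7}, {"Host", "Guest B", "Guest C"}),
    ({0: 25, 1: 5, 2: 9}, {"Host", "Guest A", "Guest B"}),
    ({0: 40, 1: 12}, {"Host", "Guest A"}),
]
labels = assign_identities(segments, host_name="Host")
```

In this made-up input only the last segment is unambiguous once the host is known. Labeling "Guest A" there resolves the second segment, which in turn resolves the first. A cluster stays unlabeled if no segment ever becomes unambiguous for it.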

Cited By

  • (2014) A conditional random field approach for audio-visual people diarization. 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 116-120. DOI: 10.1109/ICASSP.2014.6853569. Online publication date: May 2014.
  • (2010) 3rd international workshop on automated information extraction in media production. Proceedings of the 18th ACM international conference on Multimedia, pp. 1751-1752. DOI: 10.1145/1873951.1874353. Online publication date: 25 Oct 2010.

Published In

AIEMPro '10: Proceedings of the 3rd international workshop on Automated information extraction in media production
October 2010
78 pages
ISBN: 9781450301640
DOI: 10.1145/1877850
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. clustering
  2. face localization
  3. identity recognition
  4. metadata
  5. searching
  6. shot detection
  7. television programs
  8. temporal segmentation

Qualifiers

  • Research-article

Conference

MM '10
MM '10: ACM Multimedia Conference
October 29, 2010
Firenze, Italy
