research-article

Towards automatic speaker retrieval for large multimedia archives

Authors:
Marijn Huijbregts, Radboud University Nijmegen, Nijmegen, Netherlands
David van Leeuwen, Radboud University Nijmegen, Nijmegen, Netherlands

AIEMPro '10: Proceedings of the 3rd international workshop on Automated information extraction in media production, October 2010, Pages 15–20. https://doi.org/10.1145/1877850.1877857

Published: 29 October 2010

ABSTRACT

In this paper we discuss the challenges of scaling a speaker retrieval system for small audiovisual collections up to a speaker retrieval system for large audio(visual) archives. We show that with our large-scale speaker diarization approach it is possible to perform query-by-example speaker retrieval, that is, to search for audiovisual documents in which a particular person is talking. On a selection of the ICSI meeting corpus we obtain a Mean Average Precision of 0.49 and a precision-at-ten of 0.70. On a much larger archive of three months of Dutch broadcast television we obtain a precision-at-ten of 0.52.
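
The two measures quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard information-retrieval definitions of precision-at-k and (Mean) Average Precision; the abstract does not say how the authors computed them, so the function names and the toy ranking below are hypothetical and not taken from the paper.

```python
from typing import Optional, Sequence


def precision_at_k(relevance: Sequence[bool], k: int = 10) -> float:
    """Fraction of the top-k ranked documents that are relevant.

    Assumption: the divisor is always k, even if fewer than k documents
    were returned (a common convention, not confirmed by the paper).
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(relevance[:k]) / k


def average_precision(relevance: Sequence[bool],
                      num_relevant: Optional[int] = None) -> float:
    """Mean of the precision values measured at each relevant document.

    num_relevant is the total number of relevant documents in the archive;
    if omitted, only the relevant documents present in the ranking count.
    """
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this rank
    denominator = num_relevant if num_relevant is not None else hits
    return total / denominator if denominator else 0.0


def mean_average_precision(rankings: Sequence[Sequence[bool]]) -> float:
    """Average of the per-query average precision over all queries."""
    if not rankings:
        raise ValueError("at least one ranking is required")
    return sum(average_precision(r) for r in rankings) / len(rankings)


# Hypothetical ranking for one query speaker: True marks a returned
# document in which the queried person is actually talking.
ranked = [True, False, True, True, False, False, True, False, False, False]
p_at_10 = precision_at_k(ranked, k=10)
ap = average_precision(ranked)
```

In this toy ranking four of the first ten documents are relevant, so the precision-at-ten is 0.4. Average precision additionally rewards rankings that place the relevant documents near the top.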

Index Terms

Applied computing

Published in
AIEMPro '10: Proceedings of the 3rd international workshop on Automated information extraction in media production
October 2010
78 pages
ISBN:9781450301640
DOI:10.1145/1877850
General Chair: Alberto Messina (RAI R&D, Italy)
Program Chairs: Robbie De Sutter (VRT-medialab, Belgium), Jean-Pierre Evain (EBU, Switzerland), Gerald Friedland (ICSI, USA), Masanori Sano (NHK R&D, Japan)
Copyright © 2010 ACM
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
large scale diarization
speaker diarization
speaker retrieval
speaker tracking