Towards automatic transcription of large spoken archives - English ASR for the MALACH project

Abstract:

Digital archives have emerged as the pre-eminent method for capturing the human experience. Before such archives can be used efficiently, their contents must be described. The NSF-funded MALACH project aims to provide improved access to large spoken archives by advancing the state of the art in automated speech recognition (ASR), information retrieval (IR), and related technologies [1,2] for multiple languages. This paper describes the ASR research for the English speech in the MALACH corpus. The MALACH corpus consists of unconstrained, natural speech filled with disfluencies, heavy accents, age-related coarticulation, uncued speaker and language switching, and emotional speech, collected in the form of interviews from over 52,000 speakers in 32 languages. In this paper, we describe this new testbed for developing speech recognition algorithms and report on the performance of well-known techniques for building better acoustic models for the speaking styles seen in this corpus. The best English ASR system to date has a word error rate of 43.8% on this corpus.

Published in: 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03).

Date of Conference: 06-10 April 2003

Date Added to IEEE Xplore: 21 May 2003

Print ISBN: 0-7803-7663-3

Print ISSN: 1520-6149

DOI: 10.1109/ICASSP.2003.1198756

Conference Location: Hong Kong
