Word Alignment in Digital Talking Books Using WFSTs

Serralheiro, António; Caseiro, Diamantino; Meinedo, Hugo; Trancoso, Isabel

doi:10.1007/3-540-45747-X_37

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2458)

Included in the following conference series:

International Conference on Theory and Practice of Digital Libraries

Abstract

This paper describes the motivation and the method that we used for aligning digital spoken books, and the results obtained both at a word level and at a phone level. This alignment will allow specific access interfaces for persons with special needs, and also tools for easily detecting and indexing units (words, sentences, topics) in the spoken books. The tool was implemented in a Weighted Finite State Transducer framework, which provides an efficient way to combine different types of knowledge sources, such as alternative pronunciation rules. With this tool, a 2-hour long spoken book was aligned in a single step in much less than real time.
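
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea of aligning known text to audio when words have alternative pronunciations. It is not the authors' WFST tool: it runs a plain Viterbi search with additive costs (the tropical semiring commonly used for WFST weights) over a chain of words. The function name `align`, the toy lexicon, the phone symbols, the per-frame cost dictionaries, and the `mismatch` penalty are all illustrative assumptions.

```python
import math


def align(frame_costs, words, lexicon, mismatch=5.0):
    """Return (total_cost, [(word, pronunciation, first_frame, last_frame)]).

    frame_costs: one dict per audio frame, mapping phone -> cost.
    lexicon: word -> list of alternative pronunciations (tuples of phones).
    Phones missing from a frame's dict get the `mismatch` cost (assumption).
    """
    # One state per (word index, pronunciation index, phone index).
    states = [(w, p, k)
              for w, word in enumerate(words)
              for p, pron in enumerate(lexicon[word])
              for k in range(len(pron))]
    index = {s: i for i, s in enumerate(states)}

    def phone(state):
        w, p, k = state
        return lexicon[words[w]][p][k]

    def predecessors(state):
        w, p, k = state
        result = [state]                      # stay in the same phone
        if k > 0:
            result.append((w, p, k - 1))      # previous phone, same pronunciation
        elif w > 0:                           # last phone of any variant of previous word
            prev = lexicon[words[w - 1]]
            result.extend((w - 1, q, len(pr) - 1) for q, pr in enumerate(prev))
        return result

    # Frame 0 can only be the first phone of some variant of the first word.
    best = [math.inf] * len(states)
    for s in states:
        if s[0] == 0 and s[2] == 0:
            best[index[s]] = frame_costs[0].get(phone(s), mismatch)

    backpointers = []
    for t in range(1, len(frame_costs)):
        new = [math.inf] * len(states)
        bp = [None] * len(states)
        for s in states:
            i = index[s]
            emit = frame_costs[t].get(phone(s), mismatch)
            for r in predecessors(s):
                cost = best[index[r]] + emit
                if cost < new[i]:
                    new[i], bp[i] = cost, index[r]
        backpointers.append(bp)
        best = new

    # The path must end in the last phone of some variant of the last word.
    last = len(words) - 1
    finals = [index[(last, p, len(pr) - 1)]
              for p, pr in enumerate(lexicon[words[last]])]
    end = min(finals, key=lambda i: best[i])
    if math.isinf(best[end]):
        raise ValueError("too few frames for this word sequence")

    # Trace back one state per frame, then merge frames into word segments.
    path = [end]
    for bp in reversed(backpointers):
        path.append(bp[path[-1]])
    path.reverse()

    segments = []
    for t, i in enumerate(path):
        w, p, _ = states[i]
        if segments and segments[-1][0] == w:
            segments[-1][3] = t
        else:
            segments.append([w, p, t, t])
    return best[end], [(words[w], lexicon[words[w]][p], a, b)
                       for w, p, a, b in segments]


# Toy data (hypothetical): "casa" has two pronunciation variants.
lexicon = {"a": [("a",)], "casa": [("k", "a", "z", "a"), ("k", "a", "z", "6")]}
heard = ["a", "a", "k", "a", "z", "6", "6"]
frames = [{ph: 0.0} for ph in heard]          # cost 0 for the phone "heard"
cost, segments = align(frames, ["a", "casa"], lexicon)
print(cost, segments)
```

In this toy run the search picks the second variant of "casa" because its final phone matches the last two frames at zero cost. A WFST formulation would express the same lexicon, pronunciation alternatives, and text as transducers and combine them by composition rather than hand-coding the state space as above.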

Author information

Authors and Affiliations

L2F Spoken Language Systems Lab., INESC-ID/IST, Rua Alves Redol 9, 1000-029, Lisbon, Portugal
António Serralheiro, Diamantino Caseiro, Hugo Meinedo & Isabel Trancoso

Editor information

Editors and Affiliations

Department of Information Engineering, University of Padua, Via Gradenigo 6/a, 35131, Padova, Italy
Maristella Agosti
Istituto di Scienza e Tecnologie dell’Informazione (ISTI-CNR), Area della Ricerca CNR di Pisa, Via G. Moruzzi 1, 56124, Pisa, Italy
Costantino Thanos

About this paper

Cite this paper

Serralheiro, A., Caseiro, D., Meinedo, H., Trancoso, I. (2002). Word Alignment in Digital Talking Books Using WFSTs. In: Agosti, M., Thanos, C. (eds) Research and Advanced Technology for Digital Libraries. ECDL 2002. Lecture Notes in Computer Science, vol 2458. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45747-X_37

DOI: https://doi.org/10.1007/3-540-45747-X_37
Published: 13 September 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44178-6
Online ISBN: 978-3-540-45747-3
eBook Packages: Springer Book Archive
