Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework

  • Conference paper
Text, Speech, and Dialogue (TSD 2013)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8082)

Abstract

This paper presents a technique for phonetic spoken term detection in large audio archives. The technique is built within the framework of weighted finite-state transducers (WFSTs) and uses the relatively recent notion of factor automata. We enhance factor automata with score normalization and with a systematic query expansion that allows phone deletions and substitutions. The expansion compensates for frequent pronunciation imperfections and for systematic phoneme interchanges that occur during ASR decoding. Experiments show that the new WFST-based method outperforms the baseline system in both search performance and speed. Finally, the paper discusses issues with the proposed techniques that must be addressed before they can be applied to real-life tasks.
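
To make the idea of query expansion over a factor index more concrete, the following is a minimal Python sketch, not the authors' WFST implementation. It replaces the factor automaton with a plain dictionary of all contiguous phone subsequences (factors), and it expands a query by allowing phone substitutions and deletions under a cost budget. The function names, the cost values, the confusion table, and the length-based normalization are all assumptions introduced for illustration; the abstract does not specify them.

```python
from collections import defaultdict


def build_factor_index(utterances):
    """Map every contiguous phone subsequence (factor) to its occurrences.

    A dictionary stands in for the factor automaton here; this is a toy
    substitute, quadratic in utterance length.
    """
    index = defaultdict(list)
    for utt_id, phones in utterances.items():
        n = len(phones)
        for i in range(n):
            for j in range(i + 1, n + 1):
                index[tuple(phones[i:j])].append((utt_id, i))
    return index


def expand_query(query, confusions, sub_cost=1.0, del_cost=1.5, max_cost=2.0):
    """Return {variant: lowest cost} for variants reachable by substituting
    or deleting phones. All costs are hypothetical illustration values."""
    variants = {}

    def walk(pos, prefix, cost):
        if cost > max_cost:
            return
        if pos == len(query):
            if prefix and cost < variants.get(tuple(prefix), float("inf")):
                variants[tuple(prefix)] = cost
            return
        phone = query[pos]
        walk(pos + 1, prefix + [phone], cost)                # keep the phone
        for alt in confusions.get(phone, ()):
            walk(pos + 1, prefix + [alt], cost + sub_cost)   # substitute it
        walk(pos + 1, prefix, cost + del_cost)               # delete it

    walk(0, [], 0.0)
    return variants


def search(index, query, confusions, **costs):
    """Look up every query variant and keep the best score per hit.

    Dividing the cost by the query length is an assumed, simplistic form
    of score normalization, not the one used in the paper.
    """
    hits = {}
    for variant, cost in expand_query(query, confusions, **costs).items():
        score = cost / len(query)
        for hit in index.get(variant, ()):
            if score < hits.get(hit, float("inf")):
                hits[hit] = score
    return sorted(hits.items(), key=lambda kv: (kv[1], kv[0]))


# Toy data: phone strings per utterance and one confusable phone pair.
utterances = {
    "u1": ["p", "r", "a", "h", "a"],
    "u2": ["b", "r", "a", "h", "a"],
    "u3": ["p", "r", "h", "a"],
}
confusions = {"p": ["b"]}
index = build_factor_index(utterances)
results = search(index, ["p", "r", "a", "h", "a"], confusions)
```

In this sketch a lower score means a closer match, so the exact occurrence in `u1` ranks first, while `u2` (one substitution) and `u3` (one deletion) are still retrieved. In a WFST setting the same effect would typically be obtained by composing a weighted query automaton with the index rather than by enumerating variants.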

This research was supported by the Ministry of Culture Czech Republic, project No. DF12P01OVV022.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Vavruška, J., Švec, J., Ircing, P. (2013). Phonetic Spoken Term Detection in Large Audio Archive Using the WFST Framework. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_51

  • DOI: https://doi.org/10.1007/978-3-642-40585-3_51

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-40584-6

  • Online ISBN: 978-3-642-40585-3

  • eBook Packages: Computer Science (R0)