An efficient lattice-based phonetic search method for accelerating keyword spotting in large speech databases

Tetariy, Ella; Gishri, Michal; Har-Lev, Baruch; Aharonson, Vered; Moyal, Ami

doi:10.1007/s10772-012-9171-3

Published: 04 August 2012

Volume 16, pages 161–169 (2013)

International Journal of Speech Technology

Abstract

This paper describes an algorithm for reducing the computational complexity of phonetic search KeyWord Spotting (KWS). This reduction is particularly important when searching for keywords within very large speech databases with a rapid response time. The suggested algorithm consists of an anchor-based phoneme search that reduces the search space by generating hypotheses only around phonemes recognized with high reliability. Three databases were used for the evaluation: IBM Voicemail I and Voicemail II, consisting of long spontaneous utterances, and the Wall Street Journal portion of the MACROPHONE database, consisting of read speech utterances. The results indicated a reduction of nearly 90% in the computational complexity of the search while improving the false alarm rate, with only a small decrease in the detection rate across the evaluated databases. Search space reduction, as well as performance gain or loss, can be controlled according to user preferences via the suggested algorithm's parameters and thresholds.

Acknowledgements

This research was partially funded (anchor-based phonetic search) by the Chief Scientist of the Israeli Ministry of Commerce as part of a Magneton research grant #41914, “An Efficient Algorithm for Voicemail Transcription.”

Author information

Authors and Affiliations

ACLP—Afeka Center for Language Processing, Afeka Academic College of Engineering, Tel Aviv, Israel
Ella Tetariy, Michal Gishri, Baruch Har-Lev, Vered Aharonson & Ami Moyal

Corresponding author

Correspondence to Michal Gishri.

About this article

Cite this article

Tetariy, E., Gishri, M., Har-Lev, B. et al. An efficient lattice-based phonetic search method for accelerating keyword spotting in large speech databases. Int J Speech Technol 16, 161–169 (2013). https://doi.org/10.1007/s10772-012-9171-3

Received: 17 May 2012
Accepted: 20 July 2012
Published: 04 August 2012
Issue Date: June 2013
DOI: https://doi.org/10.1007/s10772-012-9171-3
