An efficient lattice-based phonetic search method for accelerating keyword spotting in large speech databases

International Journal of Speech Technology

Abstract

This paper describes an algorithm for reducing the computational complexity of phonetic search keyword spotting (KWS). This reduction is particularly important when searching for keywords within very large speech databases while aiming for a rapid response time. The suggested algorithm consists of an anchor-based phoneme search that reduces the search space by generating hypotheses only around phonemes recognized with high reliability. Three databases were used for the evaluation: IBM Voicemail I and Voicemail II, consisting of long spontaneous utterances, and the Wall Street Journal portion of the MACROPHONE database, consisting of read speech utterances. The results indicated a reduction of nearly 90% in the computational complexity of the search while improving the false alarm rate, with only a small decrease in the detection rate in both databases. Both the search space reduction and the resulting performance gain or loss can be controlled according to user preferences via the suggested algorithm's parameters and thresholds.
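
The anchor idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it assumes a single best phoneme string instead of a lattice, uses a plain position-wise mismatch count in place of a real phonetic distance, and the names `anchor_search`, `exhaustive_search`, `anchor_conf` and `max_mismatch` are hypothetical. It only shows how restricting hypotheses to windows aligned with high-confidence phonemes shrinks the number of windows that must be scored.

```python
from typing import List, Tuple

# (phoneme label, recognizer confidence in [0, 1]); a 1-best string is an
# assumption made for brevity, the paper itself works on phoneme lattices.
Phone = Tuple[str, float]


def mismatches(window: List[str], keyword: List[str]) -> int:
    """Position-wise label mismatches (stand-in for a phonetic distance)."""
    return sum(a != b for a, b in zip(window, keyword))


def exhaustive_search(phones: List[Phone], keyword: List[str],
                      max_mismatch: int) -> Tuple[List[int], int]:
    """Score every window; return (hit start indices, windows scored)."""
    labels = [label for label, _ in phones]
    k = len(keyword)
    starts = range(len(labels) - k + 1)
    hits = [s for s in starts
            if mismatches(labels[s:s + k], keyword) <= max_mismatch]
    return hits, len(starts)


def anchor_search(phones: List[Phone], keyword: List[str],
                  max_mismatch: int, anchor_conf: float) -> Tuple[List[int], int]:
    """Score only windows aligned with a high-confidence keyword phoneme."""
    labels = [label for label, _ in phones]
    k = len(keyword)
    candidates = set()
    for i, (label, conf) in enumerate(phones):
        if conf < anchor_conf:
            continue  # not reliable enough to serve as an anchor
        for j, kw_phone in enumerate(keyword):
            if kw_phone == label:
                start = i - j  # window start that aligns keyword[j] with phones[i]
                if 0 <= start <= len(labels) - k:
                    candidates.add(start)
    hits = sorted(s for s in candidates
                  if mismatches(labels[s:s + k], keyword) <= max_mismatch)
    return hits, len(candidates)


if __name__ == "__main__":
    # Made-up recognizer output, for illustration only.
    phones = [("h", 0.4), ("eh", 0.9), ("l", 0.5), ("ow", 0.95),
              ("w", 0.3), ("er", 0.6), ("l", 0.9), ("d", 0.85)]
    keyword = ["w", "er", "l", "d"]
    print(exhaustive_search(phones, keyword, max_mismatch=1))
    print(anchor_search(phones, keyword, max_mismatch=1, anchor_conf=0.8))
```

On this made-up input both searches find the keyword at the same position, but the anchored search scores one window instead of five. Raising `anchor_conf` prunes more windows and can also drop true hits, which is the kind of trade-off between search space reduction and detection rate that the abstract says is controlled by thresholds.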

Acknowledgements

This research was partially funded (anchor-based phonetic search) by the Chief Scientist of the Israeli Ministry of Commerce as part of a Magneton research grant #41914, “An Efficient Algorithm for Voicemail Transcription.”

Author information

Corresponding author

Correspondence to Michal Gishri.

About this article

Cite this article

Tetariy, E., Gishri, M., Har-Lev, B. et al. An efficient lattice-based phonetic search method for accelerating keyword spotting in large speech databases. Int J Speech Technol 16, 161–169 (2013). https://doi.org/10.1007/s10772-012-9171-3

  • DOI: https://doi.org/10.1007/s10772-012-9171-3
