Abstract
This paper describes an algorithm for reducing the computational complexity of phonetic-search keyword spotting (KWS). This reduction is particularly important when searching for keywords in very large speech databases where a rapid response time is required. The suggested algorithm is an anchor-based phoneme search that reduces the search space by generating hypotheses only around phonemes recognized with high reliability. Three databases were used for the evaluation: IBM Voicemail I and Voicemail II, which consist of long spontaneous utterances, and the Wall Street Journal portion of the MACROPHONE database, which consists of read-speech utterances. The results indicate a reduction of nearly 90% in the computational complexity of the search and an improved false alarm rate, with only a small decrease in the detection rate across the databases. Both the search space reduction and the resulting performance gain or loss can be controlled, according to user preferences, through the algorithm's parameters and thresholds.
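The anchor idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a 1-best phoneme sequence rather than a lattice, a per-phoneme confidence score in [0, 1], and plain edit distance as the match score. The names `Phone`, `anchored_search`, `anchor_threshold`, and `max_distance` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phone:
    label: str
    confidence: float  # hypothetical reliability score in [0, 1]


def edit_distance(a, b):
    """Plain Levenshtein distance between two label sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (x != y)))  # substitution or match
        prev = curr
    return prev[-1]


def anchored_search(stream, keyword, anchor_threshold=0.8, max_distance=1):
    """Score only windows aligned to a high-confidence phoneme of the keyword.

    Returns (hits, windows_scored); hits is a list of (start, distance).
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    n = len(keyword)
    starts = set()
    for pos, phone in enumerate(stream):
        if phone.confidence < anchor_threshold:
            continue  # unreliable phonemes never open a hypothesis
        for k, target in enumerate(keyword):
            if phone.label == target:
                start = pos - k  # align keyword position k with the anchor
                if 0 <= start <= len(stream) - n:
                    starts.add(start)
    hits = []
    for start in sorted(starts):
        window = [p.label for p in stream[start:start + n]]
        distance = edit_distance(window, keyword)
        if distance <= max_distance:
            hits.append((start, distance))
    return hits, len(starts)


if __name__ == "__main__":
    # Made-up toy data: "k ae t" appears once with reliable anchors, and a
    # near match "s ae t" appears with only low-confidence phonemes.
    stream = [Phone("dh", 0.4), Phone("ax", 0.3), Phone("k", 0.9),
              Phone("ae", 0.5), Phone("t", 0.95), Phone("s", 0.2),
              Phone("ae", 0.3), Phone("t", 0.4)]
    hits, scored = anchored_search(stream, ["k", "ae", "t"])
    exhaustive = len(stream) - 3 + 1
    print(hits, f"windows scored: {scored} of {exhaustive}")
```

In this toy input an exhaustive search would score six windows, while the anchored search scores one. The low-confidence near match at the end of the stream is never hypothesized, which mirrors the trade-off stated in the abstract: fewer hypotheses and false alarms, at the cost of possibly missing some detections. Lowering `anchor_threshold` moves the balance back toward exhaustive search.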
Acknowledgements
This research (anchor-based phonetic search) was partially funded by the Chief Scientist of the Israeli Ministry of Commerce as part of Magneton research grant #41914, “An Efficient Algorithm for Voicemail Transcription.”
Cite this article
Tetariy, E., Gishri, M., Har-Lev, B. et al. An efficient lattice-based phonetic search method for accelerating keyword spotting in large speech databases. Int J Speech Technol 16, 161–169 (2013). https://doi.org/10.1007/s10772-012-9171-3
DOI: https://doi.org/10.1007/s10772-012-9171-3