Abstract
This paper presents a study of keyword spotting systems based on the acoustic similarity between a filler model and a keyword model. The classifier uses the ratio between the keyword model likelihood and the generic (filler) model likelihood to detect peak values that indicate keyword occurrences. We modify the standard keyword spotting scheme so that keywords are detected in a single forward step. We also propose a new log-likelihood ratio normalization that minimizes the effect of word length on classifier performance. Tests show the effectiveness of our normalization method compared with two other methods. Experiments were performed on continuous speech utterances (read sentences) from the Portuguese TECNOVOZ database, using keywords of several lengths.
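The scoring idea in the abstract can be illustrated with a minimal sketch. It assumes that, for each frame, a decoder has already produced the log-likelihood of the best keyword hypothesis ending at that frame, the log-likelihood of the filler model over the same segment, and the segment duration in frames. The function names `llr_scores` and `detect_peaks`, the threshold, and the division by duration are all hypothetical. In particular, dividing by duration is only a common baseline for reducing the effect of word length; it is not the normalization proposed in the paper, which the abstract does not specify.

```python
from typing import List


def llr_scores(kw_loglik: List[float],
               filler_loglik: List[float],
               durations: List[int]) -> List[float]:
    """Duration-normalized log-likelihood ratio for each frame.

    Assumption: kw_loglik[t] and filler_loglik[t] are log-likelihoods of the
    same segment ending at frame t, and durations[t] is its length in frames.
    Dividing by the duration is an illustrative baseline, not the paper's method.
    """
    scores = []
    for kw, filler, dur in zip(kw_loglik, filler_loglik, durations):
        # Log of the likelihood ratio is the difference of log-likelihoods.
        scores.append((kw - filler) / max(dur, 1))
    return scores


def detect_peaks(scores: List[float], threshold: float) -> List[int]:
    """Indices of local maxima whose score is at or above the threshold."""
    peaks = []
    for t, score in enumerate(scores):
        left = scores[t - 1] if t > 0 else float("-inf")
        right = scores[t + 1] if t + 1 < len(scores) else float("-inf")
        # Strict on the left, non-strict on the right: one peak per plateau.
        if score >= threshold and score > left and score >= right:
            peaks.append(t)
    return peaks


# Hypothetical values for five frames, each hypothesis spanning 10 frames.
kw = [-50.0, -40.0, -30.0, -45.0, -60.0]
filler = [-48.0, -42.0, -40.0, -44.0, -58.0]
dur = [10, 10, 10, 10, 10]

scores = llr_scores(kw, filler, dur)
peaks = detect_peaks(scores, threshold=0.5)
```

In this toy input only the third frame scores above the threshold, so it is the single candidate keyword ending. In practice the threshold would be tuned on held-out data, for example by sweeping it to trace a detection error trade-off curve.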
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Veiga, A., Lopes, C., Sá, L., Perdigão, F. (2014). Acoustic Similarity Scores for Keyword Spotting. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_5
DOI: https://doi.org/10.1007/978-3-319-09761-9_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09760-2
Online ISBN: 978-3-319-09761-9
eBook Packages: Computer Science (R0)