Abstract
Speech recognition has developed from theoretical methods into practical systems. Since the 1990s, researchers have shifted their interest to the difficult task of Large Vocabulary Continuous Speech Recognition (LVCSR) and have made great progress. Meanwhile, many well-known research and commercial institutions have established their own recognition systems, including the ViaVoice system by IBM and the Whisper system by Microsoft. In this paper we develop a simple and efficient algorithm for recognizing speech signals in a speaker-independent isolated word recognition system. We use Mel frequency cepstral coefficients (MFCCs) as features of the recorded speech. A decoding algorithm is proposed that recognizes the target speech by computing the cepstral distance between cepstral coefficients. Simulation experiments were carried out using MATLAB, where the method achieved a word recognition accuracy of 85%.
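The decoding idea in the abstract, choosing the word whose stored cepstral features lie closest to those of the input, can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation. It assumes that MFCC vectors have already been extracted per frame, that each utterance is summarized by its mean MFCC vector, and that the cepstral distance is Euclidean. The function names (`mean_cepstrum`, `cepstral_distance`, `recognize`) and the toy numbers are hypothetical.

```python
import math


def mean_cepstrum(frames):
    """Average per-frame MFCC vectors into a single vector (an assumption of this sketch)."""
    n = len(frames)
    dim = len(frames[0])
    return [sum(frame[k] for frame in frames) / n for k in range(dim)]


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def recognize(utterance_frames, templates):
    """Return the vocabulary word whose template vector is nearest to the utterance."""
    query = mean_cepstrum(utterance_frames)
    return min(templates, key=lambda word: cepstral_distance(query, templates[word]))


# Toy, made-up 3-dimensional "MFCC" templates for a two-word vocabulary.
templates = {
    "yes": [1.0, 0.5, -0.2],
    "no": [-0.8, 0.1, 0.6],
}

# Two frames of a hypothetical test utterance.
utterance = [[0.9, 0.4, -0.1], [1.1, 0.6, -0.3]]

print(recognize(utterance, templates))
```

Averaging frames discards timing information. A system that compares frame sequences directly would need an alignment step, which this sketch leaves out.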
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pramanik, A., Raha, R. (2013). Speaker Independent Word Recognition Using Cepstral Distance Measurement. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_25
DOI: https://doi.org/10.1007/978-3-642-32063-7_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32062-0
Online ISBN: 978-3-642-32063-7
eBook Packages: Engineering, Engineering (R0)