Speaker Independent Word Recognition Using Cepstral Distance Measurement

Pramanik, Arnab; Raha, Rajorshee

doi:10.1007/978-3-642-32063-7_25

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 182)

Abstract

Speech recognition has developed from theoretical methods into practical systems. Since the 1990s, researchers have shifted their attention to the difficult task of Large Vocabulary Continuous Speech Recognition (LVCSR) and have made great progress. Meanwhile, many well-known research and commercial institutes have established their own recognition systems, including the ViaVoice system by IBM and the Whisper system by Microsoft. In this paper we develop a simple and efficient algorithm for speaker-independent isolated word recognition. We use Mel frequency cepstral coefficients (MFCCs) as features of the recorded speech. A decoding algorithm is proposed that recognizes the target speech by computing the cepstral distance between cepstral coefficients. Simulation experiments were carried out using MATLAB, where the method produced relatively good results (85% word recognition accuracy).
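
The abstract does not spell out the decoding algorithm, so the following is only a minimal sketch of what nearest-template recognition by cepstral distance could look like. It assumes the MFCC frames have already been extracted, takes the cepstral distance to be the Euclidean distance between coefficient vectors, and aligns utterances of different lengths by simple linear frame mapping. The function names, the alignment scheme, and the toy coefficient values are all illustrative assumptions, not the authors' method or data.

```python
import math


def cepstral_distance(c1, c2):
    """Euclidean distance between two cepstral coefficient vectors (assumed form)."""
    if len(c1) != len(c2):
        raise ValueError("coefficient vectors must have equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def utterance_distance(frames_a, frames_b):
    """Mean frame-wise cepstral distance after linear time alignment.

    Each frame of frames_a is compared with the frame of frames_b at the
    same relative position. This is an assumed, simplistic alignment.
    """
    n, m = len(frames_a), len(frames_b)
    if n == 0 or m == 0:
        raise ValueError("utterances must contain at least one frame")
    total = 0.0
    for i in range(n):
        j = min(m - 1, (i * m) // n)  # map frame i onto the other utterance
        total += cepstral_distance(frames_a[i], frames_b[j])
    return total / n


def recognize(frames, templates):
    """Return the label of the stored template closest to the input utterance."""
    return min(templates, key=lambda word: utterance_distance(frames, templates[word]))


# Toy 3-coefficient "MFCC" frames; the values are made up for illustration.
templates = {
    "yes": [[1.0, 0.5, -0.2], [0.9, 0.4, -0.1], [0.2, 0.1, 0.0]],
    "no": [[-0.8, 0.3, 0.6], [-0.7, 0.2, 0.5]],
}
query = [[0.95, 0.45, -0.15], [0.8, 0.4, -0.1], [0.3, 0.1, 0.0], [0.2, 0.0, 0.0]]

print(recognize(query, templates))  # prints "yes"
```

For speaker independence, each vocabulary word would typically be represented by templates from several speakers. Whether the paper does this, and how it aligns frames, cannot be determined from the abstract alone.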

Author information

Authors and Affiliations

G S Sanyal School of Telecommunication, Indian Institute of Technology, Kharagpur, India
Arnab Pramanik & Rajorshee Raha

Corresponding author

Correspondence to Arnab Pramanik.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs (MIR Labs), Scientific Network for Innovation and Research Excellence, MIR Labs Campus, Auburn, Washington 98071, USA
Ajith Abraham
Indian Institute of Information Technology and Management, Technopark Campus, Trivandrum 695581, India
Sabu M Thampi

Cite this paper

Pramanik, A., Raha, R. (2013). Speaker Independent Word Recognition Using Cepstral Distance Measurement. In: Abraham, A., Thampi, S. (eds) Intelligent Informatics. Advances in Intelligent Systems and Computing, vol 182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32063-7_25

DOI: https://doi.org/10.1007/978-3-642-32063-7_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32062-0
Online ISBN: 978-3-642-32063-7
eBook Packages: Engineering, Engineering (R0)