Abstract
Word classification plays a vital role in developing robust automatic speech recognition (ASR) applications, given the diversity in speakers' vocal tracts. This paper presents neural-network-based word classification using a combination of features: Mel-frequency cepstral coefficients (MFCC), zero crossings, zero-crossing rate (ZCR), and formants. The results of word classification are promising.
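As an illustration of one of the features named above, the following is a minimal sketch of a per-frame zero-crossing rate computation. It is not the authors' implementation: the abstract does not give frame sizes, sample rates, or a sign convention, so the 8 kHz sample rate, 200-sample frames, 100-sample hop, and the helper names `zero_crossing_rate`, `frame_signal`, and `sine` are all assumptions chosen for the example.

```python
import math


def zero_crossing_rate(frame):
    """Return the fraction of adjacent sample pairs whose signs differ.

    Assumption: a sample of exactly 0 is treated as non-negative.
    """
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_signal(signal, frame_len, hop):
    """Split a signal into overlapping frames, dropping any short tail."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, hop)]


def sine(freq_hz, sample_rate, n_samples):
    """Generate a unit-amplitude test tone (a stand-in for real speech)."""
    return [
        math.sin(2 * math.pi * freq_hz * t / sample_rate)
        for t in range(n_samples)
    ]


# Hypothetical settings, not taken from the paper.
SAMPLE_RATE = 8000
FRAME_LEN = 200
HOP = 100

low_tone = sine(100, SAMPLE_RATE, 800)
high_tone = sine(1000, SAMPLE_RATE, 800)

zcr_low = [zero_crossing_rate(f) for f in frame_signal(low_tone, FRAME_LEN, HOP)]
zcr_high = [zero_crossing_rate(f) for f in frame_signal(high_tone, FRAME_LEN, HOP)]
```

A higher-frequency signal changes sign more often, so each frame of `zcr_high` exceeds the corresponding frame of `zcr_low`. In a classifier of the kind the abstract describes, per-frame values like these would typically be combined with the other features into one input vector, though the abstract does not say how the authors do this.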
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Selvan, A.M., Rajesh, R. (2011). Word Classification Using Neural Network. In: Abraham, A., Mauri, J.L., Buford, J.F., Suzuki, J., Thampi, S.M. (eds) Advances in Computing and Communications. ACC 2011. Communications in Computer and Information Science, vol 192. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22720-2_52
DOI: https://doi.org/10.1007/978-3-642-22720-2_52
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22719-6
Online ISBN: 978-3-642-22720-2
eBook Packages: Computer Science, Computer Science (R0)