Abstract
This paper proposes a new feature extraction technique using wavelet based sub-band parameters (WBSP) for classification of unaspirated Hindi stop consonants. The extracted acoustic parameters deviate markedly from the values reported for English and other languages, because Hindi has distinguishing manner-based features.
Since acoustic parameters are difficult to extract automatically for speech recognition, Mel Frequency Cepstral Coefficient (MFCC) based features are usually used. MFCCs are based on the short-time Fourier transform (STFT), which assumes the speech signal to be stationary over a short period. This assumption is violated in particular for stop consonants.
In WBSP, guided by the acoustic study, the features derived from different segments of a CV syllable are given different weighting factors, with the middle segment receiving the largest weight. The wavelet transform is used to split the signal into 8 sub-bands of different bandwidths, and the variation of energy across the sub-bands is also taken into account. WBSP gives improved classification scores. It uses fewer filters for feature extraction (8) than MFCC (24). Its classification performance has been compared with that of four other techniques using a linear classifier. Further, principal component analysis (PCA) has also been applied to reduce dimensionality.
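The idea of weighted sub-band energies can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the wavelet, the decomposition tree, the segmentation, or the weights, so the Haar wavelet, the 7-level dyadic tree (which yields 8 sub-bands of different bandwidths), the three equal-length segments, the weights (0.25, 0.5, 0.25), and the function names below are all assumptions chosen for illustration.

```python
import math


def haar_step(x):
    """One level of a Haar DWT: return (approximation, detail) coefficients."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail


def subband_energies(frame, levels=7):
    """Log mean energy in levels + 1 sub-bands of a dyadic Haar decomposition.

    Assumption: a dyadic tree stands in for the paper's 8-band split.
    The frame length should be a multiple of 2 ** levels.
    """
    bands = []
    approx = list(frame)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        bands.append(detail)      # detail bands, highest frequency first
    bands.append(approx)          # final low-pass band
    bands.reverse()               # order bands from low to high frequency
    # Small constant avoids log(0) for silent frames.
    return [math.log(sum(c * c for c in b) / len(b) + 1e-12) for b in bands]


def wbsp_like_features(signal, weights=(0.25, 0.5, 0.25), frame_len=256):
    """Weighted sub-band log energies from equal-length syllable segments.

    Assumption: hypothetical weights, with the middle segment weighted most.
    """
    seg_len = len(signal) // len(weights)
    if seg_len < frame_len:
        raise ValueError("signal too short for the chosen frame length")
    features = []
    for k, w in enumerate(weights):
        # Take one frame from the centre of each segment.
        start = k * seg_len + (seg_len - frame_len) // 2
        frame = signal[start:start + frame_len]
        features.extend(w * e for e in subband_energies(frame))
    return features
```

With three segments and 8 sub-bands, `wbsp_like_features` returns a 24-dimensional vector for one syllable; such vectors could then be passed to PCA and a linear classifier as the abstract describes.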
Acknowledgements
We are thankful to the reviewers for providing important suggestions and constructive comments which have helped us improve the quality of the paper. We also wish to thank Mr. S. Hasan Shahid Rizvi for providing valuable help in reshaping this paper.
Cite this article
Sharma, R.P., Farooq, O. & Khan, I. Wavelet based sub-band parameters for classification of unaspirated Hindi stop consonants in initial position of CV syllables. Int J Speech Technol 16, 323–332 (2013). https://doi.org/10.1007/s10772-012-9185-x
DOI: https://doi.org/10.1007/s10772-012-9185-x