Wavelet based sub-band parameters for classification of unaspirated Hindi stop consonants in initial position of CV syllables

International Journal of Speech Technology

Abstract

This paper proposes a new feature extraction technique using wavelet based sub-band parameters (WBSP) for classification of unaspirated Hindi stop consonants. The extracted acoustic parameters deviate markedly from the values reported for English and other languages, since Hindi has distinguishing manner-based features. Because acoustic parameters are difficult to extract automatically for speech recognition, Mel Frequency Cepstral Coefficient (MFCC) based features are usually used. MFCC are based on the short-time Fourier transform (STFT), which assumes the speech signal to be stationary over a short period. This assumption is violated in particular for stop consonants.
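One way to see why fixed-frame analysis is a poor fit for stops is to look at how a short release burst is diluted inside a long analysis frame. The sketch below is only an illustration under assumed values that do not come from the paper: a 16 kHz sampling rate, a 5 ms synthetic burst, and 25 ms frames with a 10 ms hop, which are common choices for MFCC front-ends.

```python
def frame_energies(signal, frame_len, hop):
    """Mean energy of each fixed-length frame, the unit that STFT-based analysis works on."""
    return [
        sum(v * v for v in signal[start:start + frame_len]) / frame_len
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]

fs = 16000                   # assumed sampling rate in Hz (not from the paper)
burst_len = fs * 5 // 1000   # assumed 5 ms release burst, i.e. 80 samples

# Synthetic stop: 25 ms of closure silence, a unit-amplitude burst, 25 ms of silence.
signal = [0.0] * 400 + [1.0, -1.0] * (burst_len // 2) + [0.0] * 400

long_frames = frame_energies(signal, frame_len=400, hop=160)   # 25 ms frames, 10 ms hop
short_frames = frame_energies(signal, frame_len=80, hop=40)    # 5 ms frames, 2.5 ms hop

print(max(long_frames), max(short_frames))
```

With these assumed values the burst fills only one fifth of a long frame, so its energy is averaged together with the surrounding silence, while a frame matched to the burst length captures it in full.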

In WBSP, based on an acoustic study, the features derived from different segments of the CV syllable are assigned different weighting factors, with the middle segment receiving the maximum weight. The wavelet transform is applied to split the signal into 8 sub-bands of different bandwidths, and the variation of energy across the sub-bands is also taken into account. WBSP gives improved classification scores. The number of filters used for feature extraction in WBSP (8) is smaller than the number used for MFCC (24). Its classification performance has been compared with that of four other techniques using a linear classifier. Further, principal component analysis (PCA) has been applied to reduce dimensionality.
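The general idea of segment-weighted sub-band energies can be sketched as follows. This is not the authors' WBSP implementation: the abstract does not specify the wavelet, the decomposition tree, the number of segments, or the weights. The sketch assumes a 7-level Haar discrete wavelet transform (which yields 8 dyadic sub-bands of different bandwidths), three equal-length segments, and hypothetical weights of 0.25, 0.5 and 0.25 so that the middle segment dominates.

```python
import math

def haar_step(x):
    """One level of the orthonormal Haar transform: returns (approximation, detail)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = range(0, len(x) - 1, 2)
    approx = [(x[i] + x[i + 1]) * s for i in pairs]
    detail = [(x[i] - x[i + 1]) * s for i in pairs]
    return approx, detail

def subband_energies(x, levels=7):
    """Energies of `levels` detail bands plus the final approximation band."""
    energies = []
    approx = list(x)
    for _ in range(levels):
        approx, detail = haar_step(approx)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in approx))
    return energies  # levels + 1 values, highest-frequency band first

def weighted_features(syllable, weights=(0.25, 0.5, 0.25), levels=7):
    """Normalised sub-band energies per segment, scaled by hypothetical segment weights."""
    seg_len = len(syllable) // len(weights)
    features = []
    for k, w in enumerate(weights):
        segment = syllable[k * seg_len:(k + 1) * seg_len]
        energies = subband_energies(segment, levels)
        total = sum(energies) or 1.0  # avoid division by zero on a silent segment
        features.extend(w * e / total for e in energies)
    return features

# Synthetic stand-in for a CV syllable: two sinusoids, 3 segments of 256 samples.
segment = [math.sin(2 * math.pi * 5 * i / 256) + 0.5 * math.sin(2 * math.pi * 60 * i / 256)
           for i in range(256)]
features = weighted_features(segment * 3)
print(len(features))
```

Each segment contributes 8 values, so the three segments give a 24-dimensional vector here; a real front-end would likely use a smoother wavelet and a decomposition tuned to speech, and could pass the resulting vectors to PCA and a linear classifier as the abstract describes.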

Acknowledgements

We are thankful to the reviewers for providing important suggestions and constructive comments which have helped us improve the quality of the paper. We also wish to thank Mr. S. Hasan Shahid Rizvi for providing valuable help in reshaping this paper.

Author information

Corresponding author

Correspondence to R. P. Sharma.

About this article

Cite this article

Sharma, R.P., Farooq, O. & Khan, I. Wavelet based sub-band parameters for classification of unaspirated Hindi stop consonants in initial position of CV syllables. Int J Speech Technol 16, 323–332 (2013). https://doi.org/10.1007/s10772-012-9185-x

  • DOI: https://doi.org/10.1007/s10772-012-9185-x
