Time–domain non-linear feature parameter for consonant classification

International Journal of Speech Technology

Abstract

This paper introduces an accurate time–domain approach to modelling and classifying Malayalam Consonant-Vowel (CV) speech unit waveforms. The technique is based on statistical models of the Reconstructed State Space (RSS). A feature extraction method using RSS-based State Space Point Distribution (SSPD) parameters is studied. Results of simulation experiments performed on the Malayalam CV speech databases using Artificial Neural Network (ANN) and k-Nearest Neighbor (k-NN) classifiers are also presented. The results indicate that the RSS approach can increase speaker-independent consonant recognition accuracy.
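The pipeline named in the abstract (state space reconstruction, point distribution features, k-NN classification) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the embedding dimension, the time lag, the grid used to count state space points, or the distance measure, so `dim=2`, `lag=1`, `bins=4`, Euclidean distance and all function names below are assumptions chosen only for illustration.

```python
import numpy as np


def delay_embed(signal, dim=2, lag=1):
    """Time-delay embedding: row n is (x[n], x[n+lag], ..., x[n+(dim-1)*lag])."""
    x = np.asarray(signal, dtype=float)
    n_points = len(x) - (dim - 1) * lag
    if n_points <= 0:
        raise ValueError("signal too short for this dim/lag")
    return np.column_stack([x[i * lag: i * lag + n_points] for i in range(dim)])


def sspd_features(signal, bins=4, dim=2, lag=1):
    """Assumed SSPD-style feature: fraction of embedded points per grid cell."""
    x = np.asarray(signal, dtype=float)
    peak = np.max(np.abs(x))
    if peak > 0:
        x = x / peak  # scale amplitudes into [-1, 1] so the grid is fixed
    points = delay_embed(x, dim, lag)
    counts, _ = np.histogramdd(points, bins=bins, range=[(-1.0, 1.0)] * dim)
    return counts.ravel() / len(points)


def knn_predict(train_X, train_y, query, k=3):
    """Plain k-NN: majority label among the k closest training vectors."""
    dists = np.linalg.norm(train_X - query, axis=1)
    nearest = np.argsort(dists)[:k]
    labels, votes = np.unique(train_y[nearest], return_counts=True)
    return labels[np.argmax(votes)]


if __name__ == "__main__":
    # Synthetic stand-ins for two sound classes (not real CV speech data).
    t = np.linspace(0.0, 1.0, 800)
    rng = np.random.default_rng(0)
    slow = [np.sin(2 * np.pi * 5 * t) + 0.05 * rng.standard_normal(t.size)
            for _ in range(4)]
    fast = [np.sin(2 * np.pi * 60 * t) + 0.05 * rng.standard_normal(t.size)
            for _ in range(4)]
    train_X = np.array([sspd_features(s) for s in slow + fast])
    train_y = np.array(["slow"] * 4 + ["fast"] * 4)
    query = sspd_features(np.sin(2 * np.pi * 5 * t))
    print(knn_predict(train_X, train_y, query))
```

Each waveform is reduced to a fixed-length vector (here `bins**dim = 16` values that sum to 1), so utterances of different durations can be compared directly by a distance-based classifier such as k-NN.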



Author information

Corresponding author

Correspondence to T. M. Thasleema.

About this article

Cite this article

Thasleema, T.M., Prajith, P. & Narayanan, N.K. Time–domain non-linear feature parameter for consonant classification. Int J Speech Technol 15, 227–239 (2012). https://doi.org/10.1007/s10772-012-9136-6

  • DOI: https://doi.org/10.1007/s10772-012-9136-6
