
Feature analysis of pathological speech signals using local discriminant bases technique

Medical and Biological Engineering and Computing

Abstract

Speech is an integral part of the human communication system. Various pathological conditions affect the vocal functions, inducing speech disorders. Acoustic parameters of speech are commonly used to assess speech disorders and to monitor the progress of the patient over the course of therapy. In the last two decades, signal-processing techniques have been successfully applied in screening speech disorders. In this paper, a novel approach is proposed to classify pathological speech signals using a local discriminant bases (LDB) algorithm and wavelet packet decompositions. The focus of the paper is to demonstrate the significance of identifying, in a computationally efficient way, the signal subspaces that contribute to the discriminatory characteristics of normal and pathological speech signals. Features were extracted from the target subspaces for classification, and time-frequency decomposition was used to eliminate the need for segmentation of the speech signals. The technique was tested with a database of 212 speech signals (51 normal and 161 pathological) using the Daubechies wavelet (db4). Classification accuracies of up to 96% were achieved for a two-group classification (normal versus pathological speech signals), and 74% was achieved for a four-group classification (male normal, female normal, male pathological and female pathological signals).
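The subspace selection described in the abstract can be illustrated with a minimal sketch of LDB-style pruning on a wavelet packet tree. It is not the authors' implementation: the abstract does not state the dissimilarity measure, the decomposition depth or the extracted features, so the code below makes several assumptions. It uses the Haar filter instead of db4 to stay dependency-free, a squared-difference measure between class energy maps, and hypothetical toy signals whose length is a multiple of 2^depth. All function and variable names are illustrative.

```python
import math

def haar_split(x):
    """Split x into orthonormal Haar low-pass and high-pass halves."""
    s = math.sqrt(2.0)
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    low = [(a + b) / s for a, b in pairs]
    high = [(a - b) / s for a, b in pairs]
    return low, high

def wavelet_packet(x, depth):
    """Full wavelet packet tree as {(level, index): coefficients}."""
    tree = {(0, 0): list(x)}
    for level in range(depth):
        for k in range(2 ** level):
            low, high = haar_split(tree[(level, k)])
            tree[(level + 1, 2 * k)] = low
            tree[(level + 1, 2 * k + 1)] = high
    return tree

def energy_map(signals, depth):
    """Per-class time-frequency energy map, normalized by total class energy."""
    total = sum(v * v for x in signals for v in x)
    acc = {}
    for x in signals:
        for node, coeffs in wavelet_packet(x, depth).items():
            sq = [c * c / total for c in coeffs]
            prev = acc.get(node)
            acc[node] = sq if prev is None else [p + q for p, q in zip(prev, sq)]
    return acc

def discriminant(map_a, map_b, node):
    """Additive squared-difference measure (an assumed choice) for one node."""
    return sum((p - q) ** 2 for p, q in zip(map_a[node], map_b[node]))

def select_ldb(class_a, class_b, depth):
    """Bottom-up pruning: keep a parent unless its children discriminate better."""
    map_a, map_b = energy_map(class_a, depth), energy_map(class_b, depth)
    score, basis = {}, {}
    for k in range(2 ** depth):
        node = (depth, k)
        score[node] = discriminant(map_a, map_b, node)
        basis[node] = [node]
    for level in range(depth - 1, -1, -1):
        for k in range(2 ** level):
            node = (level, k)
            left, right = (level + 1, 2 * k), (level + 1, 2 * k + 1)
            own = discriminant(map_a, map_b, node)
            kids = score[left] + score[right]
            if own >= kids:
                score[node], basis[node] = own, [node]
            else:
                score[node], basis[node] = kids, basis[left] + basis[right]
    # Rank the selected nodes, most discriminative first.
    chosen = basis[(0, 0)]
    return sorted(chosen, key=lambda n: discriminant(map_a, map_b, n), reverse=True)

# Hypothetical toy data: smooth versus rapidly alternating signals of length 8.
smooth = [[1.0] * 8, [2.0] * 8]
rough = [[1.0, -1.0] * 4, [1.5, -1.5] * 4]
ldb_nodes = select_ldb(smooth, rough, depth=3)
print(ldb_nodes[:2])
```

In a classifier, features such as the coefficient energy of the top-ranked nodes would then be computed for each signal. Because the whole signal is decomposed at once, no prior segmentation is needed, which is the property the abstract highlights.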

Author information

Corresponding author

Correspondence to S. Krishnan.

About this article

Cite this article

Umapathy, K., Krishnan, S. Feature analysis of pathological speech signals using local discriminant bases technique. Med. Biol. Eng. Comput. 43, 457–464 (2005). https://doi.org/10.1007/BF02344726

  • DOI: https://doi.org/10.1007/BF02344726
