
Statistical feature evaluation for classification of stressed speech

International Journal of Speech Technology


Abstract

The variations in speech production due to stress have an adverse effect on the performance of speech and speaker recognition algorithms. In this work, different speech features, namely Sinusoidal Frequency Features (SFF), Sinusoidal Amplitude Features (SAF), Cepstral Coefficients (CC) and Mel Frequency Cepstral Coefficients (MFCC), are evaluated to determine their relative effectiveness in representing stressed speech. Several statistical feature evaluation techniques, namely probability density characteristics, the F-ratio test, the Kolmogorov-Smirnov (K-S) test and a Vector Quantization (VQ) classifier, are used to assess the performance of these features. Four stressed conditions are tested: Neutral, Compassionate, Anger and Happy. The stressed speech database used in this work consists of 600 stressed speech files recorded from 30 speakers. With the VQ classifier, SAF gives the highest recognition result, followed by SFF, MFCC and CC, in that order. The relative classification results and the relative magnitudes of the F-ratio values for the SFF, MFCC and CC features follow the same order. The SFF and MFCC features show consistent relative performance across all three tests: the F-ratio test, the K-S test and the VQ classifier.
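
The quantitative evaluation steps named above lend themselves to a short illustration. The sketch below is a minimal Python example, not the authors' implementation: it assumes feature vectors (e.g., MFCCs) have already been extracted per stress class, and the codebook size, class labels and toy data are placeholders. It computes an F-ratio for one feature dimension, a two-sample K-S statistic between two classes, and a nearest-codebook VQ decision.

```python
# Minimal sketch (assumptions: pre-extracted features, toy data, codebook size)
# of three feature-evaluation steps described in the abstract:
# F-ratio, two-sample K-S test, and a VQ codebook classifier.
import numpy as np
from scipy.stats import ks_2samp
from sklearn.cluster import KMeans


def f_ratio(features_by_class, dim):
    """Variance of the class means divided by the average
    within-class variance, for one feature dimension."""
    class_means = np.array([f[:, dim].mean() for f in features_by_class.values()])
    within_var = np.mean([f[:, dim].var() for f in features_by_class.values()])
    return class_means.var() / within_var


def ks_statistic(features_by_class, class_a, class_b, dim):
    """Two-sample K-S statistic between the empirical distributions
    of one feature dimension under two stress classes."""
    stat, _ = ks_2samp(features_by_class[class_a][:, dim],
                       features_by_class[class_b][:, dim])
    return stat


def train_codebooks(features_by_class, codebook_size=32):
    """One k-means codebook per stress class (a common VQ formulation)."""
    return {c: KMeans(n_clusters=codebook_size, n_init=10).fit(f).cluster_centers_
            for c, f in features_by_class.items()}


def classify(utterance_frames, codebooks):
    """Assign the utterance to the class whose codebook gives
    the lowest average per-frame distortion."""
    def distortion(codebook):
        d = np.linalg.norm(utterance_frames[:, None, :] - codebook[None, :, :], axis=2)
        return d.min(axis=1).mean()
    return min(codebooks, key=lambda c: distortion(codebooks[c]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy stand-in for MFCC/SFF/SAF vectors of the four stress classes.
    classes = ["Neutral", "Compassionate", "Anger", "Happy"]
    features = {c: rng.normal(loc=i, scale=1.0, size=(500, 13))
                for i, c in enumerate(classes)}
    print("F-ratio (dim 0):", f_ratio(features, dim=0))
    print("K-S Neutral vs Anger (dim 0):", ks_statistic(features, "Neutral", "Anger", dim=0))
    books = train_codebooks(features, codebook_size=8)
    test = rng.normal(loc=2, scale=1.0, size=(100, 13))  # drawn near the "Anger" cluster
    print("VQ decision:", classify(test, books))
```

In this formulation a larger F-ratio or K-S statistic indicates a feature dimension that separates the stress classes better, while the VQ decision picks the class whose codebook gives the lowest average frame distortion.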



Author information

Corresponding author

Correspondence to G. Senthil Raja.

About this article

Cite this article

Patro, H., Senthil Raja, G. & Dandapat, S. Statistical feature evaluation for classification of stressed speech. Int J Speech Technol 10, 143–152 (2007). https://doi.org/10.1007/s10772-009-9021-0
