Abstract
In this paper, a novel subspace projection approach is proposed for the analysis of speech signals under stressed conditions. The method assumes that the speech subspace and the stress subspace, which contain speech and stress information respectively, are orthogonal. Projecting stressed speech vectors onto the speech subspace separates the speech-specific information. In this work, the speech subspace is formed from neutral speech vectors. Speech and stress recognition techniques are used to verify the orthogonal relation between the speech and stress subspaces. The evaluation database consists of a 119-word vocabulary under neutral, angry, sad and Lombard conditions. Hidden Markov models with mel-frequency cepstral coefficient features are used for speech and stress recognition to evaluate the estimated speech and stress information.
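
The projection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`orthonormal_basis`, `project`), the Gram-Schmidt construction of the basis, and the toy 3-dimensional vectors are all assumptions made for illustration. Reading the orthogonal residual as the stress component is also an assumption that follows from the stated orthogonality hypothesis.

```python
# Illustrative sketch only: orthogonal projection onto a "speech subspace"
# built from neutral feature vectors. Toy data, not taken from the paper.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def orthonormal_basis(vectors, tol=1e-10):
    """Gram-Schmidt: return an orthonormal basis for span(vectors)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        if norm > tol:  # skip vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis

def project(x, basis):
    """Split x into its component in span(basis) and the orthogonal residual."""
    in_subspace = [0.0] * len(x)
    for q in basis:
        c = dot(x, q)
        in_subspace = [s + c * qi for s, qi in zip(in_subspace, q)]
    residual = [xi - si for xi, si in zip(x, in_subspace)]
    return in_subspace, residual

# Hypothetical neutral vectors defining the speech subspace.
neutral_vectors = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
# Hypothetical stressed speech vector to be analysed.
stressed_vector = [2.0, 3.0, 4.0]

basis = orthonormal_basis(neutral_vectors)
speech_part, stress_part = project(stressed_vector, basis)
```

In this toy case the neutral vectors span the plane of the first two coordinates, so `speech_part` keeps those two coordinates and `stress_part` keeps only the third. The two parts are orthogonal by construction and sum back to the original vector.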

Cite this article
Shukla, S., Dandapat, S. & Prasanna, S.R.M. A Subspace Projection Approach for Analysis of Speech Under Stressed Condition. Circuits Syst Signal Process 35, 4486–4500 (2016). https://doi.org/10.1007/s00034-016-0284-9
DOI: https://doi.org/10.1007/s00034-016-0284-9