Abstract
This paper presents the creation of Whi-Spe, a whispered speech database for the Serbian language. The database was collected to investigate how well humans use whisper in intelligible verbal communication and how well whispered information can be used in human-computer communication. The database consists of 50 isolated words produced by ten speakers (five male and five female). Each speaker pronounced the vocabulary ten times in two modes, normal and whispered, so the database contains 5,000 pairs of normal/whispered pronunciations. The database was evaluated by analysing specific manifestations of whispered articulation. Finally, preliminary results on whisper recognition using HMM, ANN and DTW techniques are presented.
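The corpus size follows directly from the figures in the abstract. A minimal sketch of that count is given here, assuming every speaker recorded every word in every repetition and mode; the integer indices and the function name are hypothetical labels for illustration, not the database's actual file naming.

```python
from itertools import product

# Figures stated in the abstract.
NUM_WORDS = 50
NUM_SPEAKERS = 10        # five male and five female
NUM_REPETITIONS = 10
MODES = ("normal", "whispered")


def enumerate_recordings():
    """Return one (speaker, word, repetition, mode) tuple per recording.

    The integer indices are hypothetical placeholders; the real corpus
    may label speakers, words and repetitions differently.
    """
    return list(product(range(NUM_SPEAKERS),
                        range(NUM_WORDS),
                        range(NUM_REPETITIONS),
                        MODES))


recordings = enumerate_recordings()
# Each normal utterance is matched with one whispered utterance.
pairs = len(recordings) // len(MODES)
print(len(recordings), pairs)
```

Under these assumptions the enumeration gives 10,000 individual recordings, which is the 5,000 normal/whispered pairs reported in the abstract.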
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Marković, B., Jovičić, S.T., Galić, J., Grozdić, Đ. (2013). Whispered Speech Database: Design, Processing and Application. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_74
DOI: https://doi.org/10.1007/978-3-642-40585-3_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science, Computer Science (R0)