Whispered Speech Database: Design, Processing and Application

Marković, Branko; Jovičić, Slobodan T.; Galić, Jovan; Grozdić, Đorđe

doi:10.1007/978-3-642-40585-3_74

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8082)

Included in the following conference series:

International Conference on Text, Speech and Dialogue

Abstract

This paper presents the creation of Whi-Spe, a whispered speech database for the Serbian language. The database was collected to investigate how well humans use whisper in intelligible verbal communication and how well whispered information can be used in human-computer communication. The vocabulary consists of 50 isolated words, spoken by ten speakers (five male and five female). Each speaker pronounced the vocabulary ten times in two modes: normal and whispered. The database therefore contains 5,000 pairs of normal/whispered pronunciations. The database was evaluated by analyzing specific manifestations of whispered articulation. Finally, preliminary results in whisper recognition using HMM, ANN and DTW techniques are presented.
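The pair count stated in the abstract follows directly from the recording design: 50 words, 10 speakers and 10 repetitions, each recorded in two modes. The sketch below enumerates that design to check the arithmetic. It is only an illustration: the speaker labels (M1..M5, F1..F5) and the tuple layout are hypothetical and do not reflect how the Whi-Spe files are actually named or organized.

```python
from itertools import product

# Figures taken from the abstract.
N_WORDS = 50
N_REPETITIONS = 10
MODES = ("normal", "whisper")

# Hypothetical speaker labels: five male (M1..M5) and five female (F1..F5).
SPEAKERS = [f"{gender}{i}" for gender in ("M", "F") for i in range(1, 6)]


def enumerate_pairs():
    """Yield one (speaker, word, repetition) key per normal/whispered pair."""
    words = range(1, N_WORDS + 1)
    repetitions = range(1, N_REPETITIONS + 1)
    yield from product(SPEAKERS, words, repetitions)


pairs = list(enumerate_pairs())
n_pairs = len(pairs)                   # one entry per normal/whispered pair
n_recordings = n_pairs * len(MODES)    # each pair holds two recordings

print(n_pairs, n_recordings)
```

Under these assumptions the enumeration yields 5,000 pairs, that is, 10,000 individual recordings across both modes.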

Author information

Authors and Affiliations

Computing and Information Technology Department, Čačak Technical College, Čačak, Serbia
Branko Marković
School of Electrical Engineering, Telecommunications Department, University of Belgrade, Belgrade, Serbia
Slobodan T. Jovičić & Đorđe Grozdić
Laboratory for Psychoacoustics and Speech Perception, Life Activities Advancement Center, Belgrade, Serbia
Slobodan T. Jovičić & Đorđe Grozdić
Faculty of Electrical Engineering, Department of Electronics and Telecommunications, University of Banja Luka, Banja Luka, Bosnia and Herzegovina
Jovan Galić

Editor information

Editors and Affiliations

University of West Bohemia, 306 14, Pilsen, Czech Republic
Ivan Habernal & Václav Matoušek

Cite this paper

Marković, B., Jovičić, S.T., Galić, J., Grozdić, Đ. (2013). Whispered Speech Database: Design, Processing and Application. In: Habernal, I., Matoušek, V. (eds) Text, Speech, and Dialogue. TSD 2013. Lecture Notes in Computer Science, vol 8082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40585-3_74

DOI: https://doi.org/10.1007/978-3-642-40585-3_74
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40584-6
Online ISBN: 978-3-642-40585-3
eBook Packages: Computer Science, Computer Science (R0)