Abstract
Speech production changes in noise because of the Lombard effect. The traditional way to study it is to deliver noise through headphones so that speech can be recorded in quiet, but this can introduce an occlusion-effect artefact during speech production. The headphone method is also not directly applicable when the talker wears hearing protectors, hearing aids, or other devices, because the headphones physically interfere with them. In these situations, the Lombard effect needs to be elicited by an external noise field, with speech recorded in the presence of the noise. This is a more challenging measurement situation, but one that preserves the talker's perception of their own voice and of the surrounding noise in interaction with the hearing device worn. Two methods, direct waveform subtraction and adaptive noise cancellation, were evaluated for suppressing the background noise in the recorded speech. The effects of the recording configuration on performance were investigated for two microphone types (omnidirectional and directional) at two distances (50 and 25 cm), in different noises and in the presence of real talker movement. Results show that both suppression methods achieve more noise reduction for fluctuating than for continuous noises. Overall, the best recording configuration for noise reduction was the omnidirectional microphone at 25 cm. Pitch extraction, energy level, and objective speech intelligibility and quality measures show that both suppression methods provide adequate noise reduction for SNRs as low as −10 dB, which is sufficient to recover Lombard speech produced in an external noise field with open ears and when wearing hearing protectors.
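The two suppression approaches named in the abstract can be illustrated with a minimal sketch on synthetic signals. This is not the authors' implementation: the NLMS update used for the adaptive canceller, the four-tap loudspeaker-to-microphone path `room`, the sinusoidal stand-in for speech, the small repeatability error added to the noise-only recording, and every parameter value (`taps`, `mu`, amplitudes) are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def direct_subtraction(noisy, noise_only):
    """Subtract a time-aligned noise-only recording from the noisy recording."""
    n = min(len(noisy), len(noise_only))
    return noisy[:n] - noise_only[:n]

def nlms_cancel(primary, reference, taps=32, mu=0.05, eps=1e-8):
    """NLMS adaptive noise canceller (assumed variant).

    primary:   microphone signal (speech + filtered noise)
    reference: noise source signal fed to the loudspeaker
    Returns the error signal, which serves as the speech estimate.
    """
    w = np.zeros(taps)
    out = np.zeros(len(primary))
    padded = np.concatenate([np.zeros(taps - 1), reference])
    for n in range(len(primary)):
        x = padded[n:n + taps][::-1]        # most recent reference sample first
        e = primary[n] - w @ x              # remove the current noise estimate
        w += mu * e * x / (x @ x + eps)     # normalised LMS weight update
        out[n] = e
    return out

def snr_db(clean, estimate):
    """SNR of an estimate relative to the known clean signal, in dB."""
    residual = estimate - clean
    return 10 * np.log10(np.sum(clean ** 2) / (np.sum(residual ** 2) + 1e-20))

# Synthetic demo (all signals and values are hypothetical).
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(2 * fs) / fs
speech = 0.3 * np.sin(2 * np.pi * 150 * t)           # stand-in for voiced speech
noise_src = rng.standard_normal(len(t))              # signal sent to the loudspeaker
room = np.array([0.6, 0.3, -0.2, 0.1])               # assumed acoustic path to the mic
noise_at_mic = np.convolve(noise_src, room)[:len(t)]
noisy = speech + noise_at_mic                        # recording made in the noise field

# Noise-only recording of the same playback, with a small assumed repeatability error.
noise_only = noise_at_mic + 0.01 * rng.standard_normal(len(t))

snr_in = snr_db(speech, noisy)
snr_sub = snr_db(speech, direct_subtraction(noisy, noise_only))

anc = nlms_cancel(noisy, noise_src)
half = len(t) // 2                                   # skip the adaptation transient
snr_anc = snr_db(speech[half:], anc[half:])

print(f"input SNR: {snr_in:.1f} dB")
print(f"after direct subtraction: {snr_sub:.1f} dB")
print(f"after NLMS cancellation: {snr_anc:.1f} dB")
```

In this sketch, direct subtraction depends on how closely the noise-only recording matches the noise present during speech, while the adaptive canceller only needs the noise source signal and estimates the acoustic path itself. Talker movement, which the study examines, would change that path over time and is not modelled here.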
Cite this article
Vaziri, G., Giguère, C. & Dajani, H.R. Evaluating noise suppression methods for recovering the Lombard speech from vocal output in an external noise field. Int J Speech Technol 22, 31–46 (2019). https://doi.org/10.1007/s10772-018-09564-8
DOI: https://doi.org/10.1007/s10772-018-09564-8