Study on the Effect of Face Masks on Forensic Speaker Recognition

  • Conference paper
Information and Communications Security (ICICS 2022)

Abstract

The COVID-19 pandemic has led to a dramatic increase in the use of face masks. Face masks can affect both the acoustic properties of the speech signal and the speaker's speech patterns, with undesirable effects on automatic speech recognition systems as well as on forensic speaker recognition and identification systems. This is because masks introduce both intrinsic and extrinsic variability into the audio signal. Moreover, their filtering effect varies with the type of mask used. In this paper we explore the impact of different masks on the performance of an automatic speaker recognition system that uses Mel Frequency Cepstral Coefficients (MFCCs) to characterise the voices and Support Vector Machines (SVMs) to perform the classification task. The results show that masks slightly affect the classification results. The effects vary with the type of mask, but not as expected: the results with FFP2 masks are better than those with surgical masks. An increase in speech intensity was found with the FFP2 mask, which is related to the increased vocal effort made to counteract the effects of hearing loss.
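
The kind of pipeline the abstract describes, MFCC features fed to an SVM classifier, can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: the "speakers" are synthetic harmonic tones rather than real recordings, the per-utterance feature is the mean MFCC vector over frames (an assumption), and the frame sizes, filter count, and SVM hyperparameters are placeholder values chosen for the example.

```python
import numpy as np
from scipy.fftpack import dct
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, centre, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, centre):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (centre - left)
        for k in range(centre, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - centre)
    return fbank


def mfcc(signal, sr, n_mfcc=13, n_filters=26, n_fft=512, hop=160):
    """Return an (n_frames, n_mfcc) matrix of cepstral coefficients."""
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    # Power spectrum -> mel filterbank energies -> log -> DCT.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sr).T
    log_mel = np.log(mel_energy + 1e-10)
    return dct(log_mel, type=2, axis=1, norm="ortho")[:, :n_mfcc]


def synth_utterance(f0, rng, sr=16000, seconds=0.5):
    """Hypothetical stand-in for a recording: a noisy harmonic tone."""
    t = np.arange(int(sr * seconds)) / sr
    harmonics = sum(np.sin(2 * np.pi * f0 * h * t) / h for h in range(1, 6))
    return harmonics + 0.05 * rng.standard_normal(t.size)


def build_dataset(f0s=(110.0, 160.0, 220.0), per_speaker=20, seed=0):
    """One mean-MFCC vector per utterance, labelled by synthetic speaker."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for label, f0 in enumerate(f0s):
        for _ in range(per_speaker):
            audio = synth_utterance(f0 + rng.uniform(-3, 3), rng)
            X.append(mfcc(audio, 16000).mean(axis=0))
            y.append(label)
    return np.array(X), np.array(y)


X, y = build_dataset()
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
# Standardise the features, then fit an RBF-kernel SVM (assumed settings).
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

To probe a mask effect with a sketch like this, one could train on unmasked utterances and score on masked ones, so that any drop in accuracy reflects the mismatch the mask introduces.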

Notes

  1. In Spanish, only the first two formants, F1 and F2, have the characteristics that make the difference between one vowel sound and another. This is due to the relationship between the location of the formants in the spectrogram and the position of the organs involved in articulation [27].

  2. According to Delgado-Romero [8], “a control sample is one that belongs to a known subject, while a recovered sample is anonymous, i.e. the identity of the person who carried it out is not known”.

  3. The corpus repository and the ASR system are available at: https://tinyurl.com/8h8dteuu.

References

  1. Atcherson, S.R., et al.: The effect of conventional and transparent surgical masks on speech understanding in individuals with and without hearing loss. J. Am. Acad. Audiol. 28, 58–67 (2017)

  2. Audacity Team: Audacity (R): Free audio editor and recorder [computer application] (2022). www.audacityteam.org/

  3. Boersma, P., Weenink, D.: Praat: doing phonetics by computer [computer program] (version 6.2.10) (2009). www.praat.org. Accessed 17 Mar 2022

  4. Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (2011). https://doi.org/10.1145/1961189.1961199

  5. Coniam, D.: The impact of wearing a face mask in a high-stakes oral examination: an exploratory post-SARS study in Hong Kong. Lang. Assess. Q.: Int. J. 2, 235–261 (2005)

  6. Corey, R.M., Jones, U., Singer, A.C.: Acoustic effects of medical, cloth, and transparent face masks on speech signals. J. Acoust. Soc. Am. 148, 2371–2375 (2020). https://doi.org/10.1121/10.0002279

  7. Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995). https://doi.org/10.1023/A:1022627411411

  8. Delgado-Romero, C.: La Identificación de Locutores en el Ámbito Forense (in Spanish). Ph.D. thesis, Departamento de Comunicación y Publicidad II. Facultad de Ciencias de la Información. Universidad Complutense de Madrid. España (2001)

  9. Deller, J.R., Proakis, J.G., Hansen, J.H.L.: Discrete-Time Processing of Speech Signals. Institute of Electrical and Electronics Engineers, New York (2015)

  10. ENFSI: Forensic speech and audio analysis working group terms of reference for forensic speaker analysis. European Network of Forensic Science Institutes, pp. 1–4 (2008)

  11. Leu, F.Y., Lin, G.L.: An MFCC-based speaker identification system. In: IEEE 31st International Conference on Advanced Information Networking and Applications, AINA, pp. 1055–1062. Institute of Electrical and Electronics Engineers Inc. (2017). https://doi.org/10.1109/AINA.2017.130

  12. Maher, R.C.: Principles of Forensic Audio Analysis. Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-319-99453-6

  13. McFee, B., et al.: librosa/librosa: 0.9.1 (2022). https://doi.org/10.5281/zenodo.6097378

  14. McFee, B., et al.: Librosa: audio and music signal analysis in python. In: Proceedings of the 14th Python in Science Conference, pp. 18–24 (2015). https://doi.org/10.25080/majora-7b98e3ed-003

  15. Mendel, L.L., Gardino, J.A., Atcherson, S.R.: Speech understanding using surgical masks: a problem in health care? J. Am. Acad. Audiol. 19, 686–695 (2008)

  16. Nguyen, D.D., et al.: Acoustic voice characteristics with and without wearing a facemask. Sci. Rep. 11, 1–11 (2021). https://doi.org/10.1038/s41598-021-85130-8

  17. Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)

  18. Pörschmann, C., Lübeck, T., Arend, J.M.: Impact of face masks on voice radiation. J. Acoust. Soc. Am. 148, 3663–3670 (2020). https://doi.org/10.1121/10.0002853

  19. Radonovich, L.J., Jr., Yanke, R., Cheng, J., Bender, B.: Diminished speech intelligibility associated with certain types of respirators worn by healthcare workers. J. Occup. Environ. Hyg. 7, 63–70 (2009)

  20. Randazzo, M., Koenig, L.L., Priefer, R.: The effect of face masks on the intelligibility of unpredictable sentences. In: Proceedings of Meetings on Acoustics, vol. 42 (2020). https://doi.org/10.1121/2.0001374

  21. Rao, K.S., Vuppala, A.K.: Speech Processing in Mobile Environments. SECE, Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-319-03116-3

  22. Ratha, N.K., Connell, J.H., Bolle, R.M.: Enhancing security and privacy in biometrics-based authentication systems. IBM Syst. J. 40(3), 614–634 (2001)

  23. Ribeiro, V., Dassie-Leite, A.P., Pereira, E.C., Santos, A.D.N., Martins, P., de Irineu, R.: Effect of wearing a face mask on vocal self-perception during a pandemic. J. Voice (2020)

  24. Saeidi, R., Huhtakallio, I., Alku, P.: Analysis of face mask effect on speaker recognition. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 08, pp. 1800–1804 (2016). https://doi.org/10.21437/Interspeech.2016-518

  25. Saeidi, R., Niemi, T., Karppelin, H., Pohjalainen, J., Kinnunen, T., Alku, P.: Speaker recognition for speech under face cover. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2015-January, pp. 1012–1016 (2015). https://doi.org/10.21437/interspeech.2015-275

  26. Saleem, S., Subhan, F., Naseer, N., Bais, A., Imtiaz, A.: Forensic speaker recognition: a new method based on extracting accent and language information from short utterances. Forensic Sci. Int.: Digital Invest. 34, 300982 (2020)

  27. Sánchez-López, D.: Análisis acústico y sonográfico de la vocal /a/ para su aplicación en el ámbito de las ciencias forenses (2016). https://tinyurl.com/h5ncwpv. (in Spanish)

  28. Wainer, J., Fonseca, P.: How to tune the RBF SVM hyperparameters? An empirical evaluation of 18 search algorithms. Artif. Intell. Rev. 54, 4771–4797 (2021)

  29. Wu, Z., Evans, N., Kinnunen, T., Yamagishi, J., Alegre, F., Li, H.: Spoofing and countermeasures for speaker verification: a survey. Speech Commun. 66, 130–153 (2015)

Acknowledgement

This research was supported by the Research Grants Program of the Universidad de Alcalá. We acknowledge the valuable counsel and resources provided by G. A. Acha Ruiz, and we thank the Department of Forensic Acoustics of the “Comisaría General de Policía Científica” for access to the LOCUPOL database sentences.

Author information

Corresponding author

Correspondence to Hilario Gómez-Moreno.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Bogdanel, G., Belghazi-Mohamed, N., Gómez-Moreno, H., Lafuente-Arroyo, S. (2022). Study on the Effect of Face Masks on Forensic Speaker Recognition. In: Alcaraz, C., Chen, L., Li, S., Samarati, P. (eds) Information and Communications Security. ICICS 2022. Lecture Notes in Computer Science, vol 13407. Springer, Cham. https://doi.org/10.1007/978-3-031-15777-6_33

  • DOI: https://doi.org/10.1007/978-3-031-15777-6_33

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15776-9

  • Online ISBN: 978-3-031-15777-6

  • eBook Packages: Computer Science, Computer Science (R0)
