Study on the Effect of Face Masks on Forensic Speaker Recognition

Bogdanel, Georgiana; Belghazi-Mohamed, Nadia; Gómez-Moreno, Hilario; Lafuente-Arroyo, Sergio

doi:10.1007/978-3-031-15777-6_33

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13407)

Included in the following conference series:

International Conference on Information and Communications Security

Abstract

The COVID-19 pandemic has led to a dramatic increase in the use of face masks. Masks can alter both the acoustic properties of the speech signal and the speaker's speech patterns, with undesirable effects on automatic speech recognition systems and on forensic speaker recognition and identification systems. This is because masks introduce both intrinsic and extrinsic variability into the audio signal. Moreover, their filtering effect depends on the type of mask used. In this paper we explore how different masks affect the performance of an automatic speaker recognition system that uses Mel Frequency Cepstral Coefficients (MFCCs) to characterise the voices and Support Vector Machines (SVMs) to perform the classification. The results show that masks affect the classification results only slightly. The effect depends on the type of mask, but not in the expected way: the results with FFP2 masks are better than those with surgical masks. An increase in speech intensity was found with the FFP2 mask, which is related to the increased vocal effort made to counteract the effects of hearing loss.
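
The abstract names two building blocks, MFCC features and an SVM classifier. The following is a minimal sketch of how such a pipeline can be put together and evaluated under mismatched conditions; it is not the authors' implementation. Synthetic harmonic tones stand in for voices, the mask is modelled as a fixed 6 dB attenuation above 2 kHz (an arbitrary illustrative value, not a measurement), and every name and parameter (`mfcc`, `simulated_mask`, frame sizes, the RBF kernel) is an assumption made for the example.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

SR = 16000  # sampling rate in Hz (assumed)


def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters evenly spaced on the mel scale."""
    to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    edges = to_hz(np.linspace(to_mel(0.0), to_mel(sr / 2.0), n_mels + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_mels, freqs.size))
    for i in range(n_mels):
        lo, mid, hi = edges[i : i + 3]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def mfcc(signal, sr, n_fft=512, hop=256, n_mels=26, n_mfcc=13):
    """Frame the signal and return an (n_frames, n_mfcc) MFCC matrix."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_mel = np.log(power @ mel_filterbank(sr, n_fft, n_mels).T + 1e-10)
    # DCT-II basis turns the log mel energies into cepstral coefficients.
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    dct = np.cos(np.pi * k * (2 * n + 1) / (2 * n_mels))
    return log_mel @ dct.T


def synth_voice(f0, tilt, seconds, rng):
    """Toy 'voice': 20 harmonics of f0 with a 1/h**tilt envelope plus noise."""
    t = np.arange(int(SR * seconds)) / SR
    tone = sum(np.sin(2 * np.pi * f0 * h * t) / h**tilt for h in range(1, 21))
    return tone + 0.01 * rng.standard_normal(t.size)


def simulated_mask(signal, cutoff_hz=2000.0, loss_db=6.0):
    """Crude mask model (assumption): attenuate everything above cutoff_hz."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / SR)
    spec[freqs > cutoff_hz] *= 10.0 ** (-loss_db / 20.0)
    return np.fft.irfft(spec, n=signal.size)


def frames_and_labels(recordings, transform=None):
    """Stack per-frame MFCCs of all speakers with one label per frame."""
    feats, labels = [], []
    for name, sig in recordings.items():
        m = mfcc(sig if transform is None else transform(sig), SR)
        feats.append(m)
        labels.extend([name] * len(m))
    return np.vstack(feats), np.array(labels)


# Hypothetical speakers: (fundamental frequency in Hz, spectral tilt).
rng = np.random.default_rng(0)
speakers = {"spk_a": (110.0, 0.5), "spk_b": (150.0, 1.0), "spk_c": (200.0, 2.0)}
train = {s: synth_voice(f0, tilt, 1.0, rng) for s, (f0, tilt) in speakers.items()}
test = {s: synth_voice(f0, tilt, 1.0, rng) for s, (f0, tilt) in speakers.items()}

# Train on unmasked speech only, then score unmasked and "masked" test speech.
X_train, y_train = frames_and_labels(train)
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
clf.fit(X_train, y_train)

acc_clean = clf.score(*frames_and_labels(test))
acc_masked = clf.score(*frames_and_labels(test, simulated_mask))
print(f"frame accuracy, no mask: {acc_clean:.2f}")
print(f"frame accuracy, simulated mask: {acc_masked:.2f}")
```

Training on unmasked frames and scoring both conditions is one simple way to expose a mask-induced mismatch. With real recordings, the features would more likely come from an audio library such as librosa, and the mask effect from actual masked speech rather than a fixed filter.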

Notes

1. In Spanish, only the first two formants, F1 and F2, carry the characteristics that distinguish one vowel sound from another. This is due to the relationship between the location of the formants in the spectrogram and the position of the organs involved in articulation [27].
2. According to Delgado-Romero [8], “a control sample is one that belongs to a known subject, while a recovered sample is anonymous, i.e. the identity of the person who carried it out is not known”.
3. The corpus repository and the ASR system are available at: https://tinyurl.com/8h8dteuu.

References

Atcherson, S.R., et al.: The effect of conventional and transparent surgical masks on speech understanding in individuals with and without hearing loss. J. Am. Acad. Audiol. 28, 58–67 (2017)
Audacity Team: Audacity (R): Free audio editor and recorder [computer application] (2022). www.audacityteam.org/
Boersma, P., Weenink, D.: Praat: doing phonetics by computer [computer program] (version 6.2.10) (2009). www.praat.org. Accessed 17 Mar 2022
Chang, C.C., Lin, C.J.: LIBSVM: a library for support vector machines. ACM Trans. Intell. Syst. Technol. 2, 1–27 (2011). https://doi.org/10.1145/1961189.1961199
Coniam, D.: The impact of wearing a face mask in a high-stakes oral examination: an exploratory post-SARS study in Hong Kong. Lang. Assess. Q.: Int. J. 2, 235–261 (2005)
Corey, R.M., Jones, U., Singer, A.C.: Acoustic effects of medical, cloth, and transparent face masks on speech signals. J. Acoust. Soc. Am. 148, 2371–2375 (2020). https://doi.org/10.1121/10.0002279
Cortes, C., Vapnik, V.: Support-vector networks. Mach. Learn. 20(3), 273–297 (1995). https://doi.org/10.1023/A:1022627411411
Delgado-Romero, C.: La Identificación de Locutores en el Ámbito Forense (in Spanish). Ph.D. thesis, Departamento de Comunicación y Publicidad II. Facultad de Ciencias de la Información. Universidad Complutense de Madrid. España (2001)
Deller, J.R., Proakis, J.G., Hansen, J.H.L.: Discrete-Time Processing of Speech Signals. Institute of Electrical and Electronics Engineers, New York (2015)
ENFSI: Forensic speech and audio analysis working group terms of reference for forensic speaker analysis. European Network of Forensic Science Institutes, pp. 1–4 (2008)
Leu, F.Y., Lin, G.L.: An MFCC-based speaker identification system. In: IEEE 31st International Conference on Advanced Information Networking and Applications, AINA, pp. 1055–1062. Institute of Electrical and Electronics Engineers Inc. (2017). https://doi.org/10.1109/AINA.2017.130
Maher, R.C.: Principles of Forensic Audio Analysis. Springer, Heidelberg (2018). https://doi.org/10.1007/978-3-319-99453-6
McFee, B., et al.: Thassilo: librosa/librosa: 0.9.1 (2022). https://doi.org/10.5281/zenodo.6097378
McFee, B., et al.: Librosa: audio and music signal analysis in python. In: Proceedings of the 14th Python in Science Conference, pp. 18–24 (2015). https://doi.org/10.25080/majora-7b98e3ed-003
Mendel, L.L., Gardino, J.A., Atcherson, S.R.: Speech understanding using surgical masks: a problem in health care? J. Am. Acad. Audiol. 19, 686–695 (2008)
Nguyen, D.D., et al.: Acoustic voice characteristics with and without wearing a facemask. Sci. Rep. 11, 1–11 (2021). https://doi.org/10.1038/s41598-021-85130-8
Pedregosa, F., et al.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Pörschmann, C., Lübeck, T., Arend, J.M.: Impact of face masks on voice radiation. J. Acoust. Soc. Am. 148, 3663–3670 (2020). https://doi.org/10.1121/10.0002853
Radonovich, L.J., Jr., Yanke, R., Cheng, J., Bender, B.: Diminished speech intelligibility associated with certain types of respirators worn by healthcare workers. J. Occup. Environ. Hyg. 7, 63–70 (2009)
Randazzo, M., Koenig, L.L., Priefer, R.: The effect of face masks on the intelligibility of unpredictable sentences. In: Proceedings of Meetings on Acoustics, vol. 42 (2020). https://doi.org/10.1121/2.0001374
Rao, K.S., Vuppala, A.K.: Speech Processing in Mobile Environments. SECE, Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-319-03116-3
Ratha, N.K., Connell, J.H., Bolle, R.M.: Enhancing security and privacy in biometrics-based authentication systems. IBM Syst. J. 40(3), 614–634 (2001)
Ribeiro, V., Dassie-Leite, A.P., Pereira, E.C., Santos, A.D.N., Martins, P., de Irineu, R.: Effect of wearing a face mask on vocal self-perception during a pandemic. J. Voice (2020)
Saeidi, R., Huhtakallio, I., Alku, P.: Analysis of face mask effect on speaker recognition. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 08, pp. 1800–1804 (2016). https://doi.org/10.21437/Interspeech.2016-518
Saeidi, R., Niemi, T., Karppelin, H., Pohjalainen, J., Kinnunen, T., Alku, P.: Speaker recognition for speech under face cover. In: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, vol. 2015-January, pp. 1012–1016 (2015). https://doi.org/10.21437/interspeech.2015-275
Saleem, S., Subhan, F., Naseer, N., Bais, A., Imtiaz, A.: Forensic speaker recognition: a new method based on extracting accent and language information from short utterances. Forensic Sci. Int.: Digital Invest. 34, 300982 (2020)
Sánchez-López, D.: Análisis acústico y sonográfico de la vocal /a/ para su aplicación en el ámbito de las ciencias forenses (2016). https://tinyurl.com/h5ncwpv. (in Spanish)
Wainer, J., Fonseca, P.: How to tune the RBF SVM hyperparameters? An empirical evaluation of 18 search algorithms. Artif. Intell. Rev. 54, 4771–4797 (2021)
Wu, Z., Evans, N., Kinnunen, T., Yamagishi, J., Alegre, F., Li, H.: Spoofing and countermeasures for speaker verification: a survey. Speech Commun. 66, 130–153 (2015)

Acknowledgement

This research was supported by the Research Grants Program of the Universidad de Alcalá. We acknowledge the valuable counsel and resources provided by G. A. Acha Ruiz, and we thank the Department of Forensic Acoustics of the “Comisaría General de Policía Científica” for access to the LOCUPOL database sentences.

Author information

Authors and Affiliations

Escuela Politécnica Superior, Departamento de Teoría de la Señal y Comunicaciones, Universidad de Alcalá, 28871, Alcalá de Henares, Madrid, Spain
Georgiana Bogdanel, Nadia Belghazi-Mohamed, Hilario Gómez-Moreno & Sergio Lafuente-Arroyo
Instituto Universitario de Investigación en Ciencias Policiales, Facultad de Derecho, Universidad de Alcalá, 28801, Alcalá de Henares, Madrid, Spain
Hilario Gómez-Moreno & Sergio Lafuente-Arroyo

Authors

Georgiana Bogdanel
Nadia Belghazi-Mohamed
Hilario Gómez-Moreno
Sergio Lafuente-Arroyo

Corresponding author

Correspondence to Hilario Gómez-Moreno.

Editor information

Editors and Affiliations

University of Malaga, Malaga, Spain
Cristina Alcaraz
University of Surrey, Guildford, UK
Liqun Chen
University of Kent, Canterbury, UK
Shujun Li
University of Milan, Milan, Italy
Pierangela Samarati

About this paper

Cite this paper

Bogdanel, G., Belghazi-Mohamed, N., Gómez-Moreno, H., Lafuente-Arroyo, S. (2022). Study on the Effect of Face Masks on Forensic Speaker Recognition. In: Alcaraz, C., Chen, L., Li, S., Samarati, P. (eds) Information and Communications Security. ICICS 2022. Lecture Notes in Computer Science, vol 13407. Springer, Cham. https://doi.org/10.1007/978-3-031-15777-6_33

DOI: https://doi.org/10.1007/978-3-031-15777-6_33
Published: 24 August 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-15776-9
Online ISBN: 978-3-031-15777-6
eBook Packages: Computer Science, Computer Science (R0)
