Abstract
Speakers can conceal their identity by deliberately changing their speech characteristics, that is, by disguising their voices. During voice disguise, speakers alter the normal movements of their articulators, such as tongue position, according to a predetermined strategy. Although technology for accurate articulatory measurement has existed for years, few studies have investigated articulation during voice disguise. In this pilot study, we recorded the articulation of four speakers during regular and disguised speech using electromagnetic articulography. We analyzed imitation of foreign accents as a voice disguise strategy and used functional t-tests as a novel method for revealing articulatory differences between regular and disguised speech. In addition, we evaluated the discovered articulatory differences in light of the performance of an x-vector-based automatic speaker verification system.
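To illustrate the idea behind a functional t-test, the following minimal sketch compares two groups of time-normalized trajectories point by point and uses a permutation distribution of the maximum statistic to find a critical value. It is not the analysis code of the study: the function names, the Welch-type statistic, the permutation scheme, and the synthetic curves (a sine trajectory with a localized shift standing in for "disguised" speech) are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, variance


def pointwise_t(group_a, group_b):
    """Absolute Welch-type t-statistic at each time point of two curve sets."""
    stats = []
    for j in range(len(group_a[0])):
        a = [curve[j] for curve in group_a]
        b = [curve[j] for curve in group_b]
        se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
        stats.append(abs(mean(a) - mean(b)) / se if se > 0 else 0.0)
    return stats


def functional_t_test(group_a, group_b, n_perm=200, alpha=0.05, seed=0):
    """Permutation test: compare observed pointwise statistics with the
    empirical (1 - alpha) quantile of the maximum statistic obtained
    after randomly reassigning curves to the two groups."""
    rng = random.Random(seed)
    observed = pointwise_t(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    max_stats = []
    for _ in range(n_perm):
        rng.shuffle(pooled)
        max_stats.append(max(pointwise_t(pooled[:n_a], pooled[n_a:])))
    max_stats.sort()
    idx = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    critical = max_stats[idx]
    significant = [t > critical for t in observed]
    return observed, critical, significant


def make_curves(n_curves, n_points, bump, rng):
    """Synthetic trajectories (hypothetical data, not EMA recordings)."""
    curves = []
    for _ in range(n_curves):
        curve = []
        for j in range(n_points):
            t = j / (n_points - 1)
            shift = bump if 0.4 <= t <= 0.6 else 0.0
            curve.append(math.sin(2 * math.pi * t) + shift + rng.gauss(0, 0.1))
        curves.append(curve)
    return curves


data_rng = random.Random(1)
regular = make_curves(10, 21, bump=0.0, rng=data_rng)
disguised = make_curves(10, 21, bump=1.0, rng=data_rng)

observed, critical, significant = functional_t_test(regular, disguised)
print("critical value:", round(critical, 2))
print("time points above critical value:",
      [j for j, flag in enumerate(significant) if flag])
```

Taking the maximum over time points within each permutation controls for the many pointwise comparisons, so intervals where the observed statistic exceeds the critical value can be read as regions where the two sets of trajectories differ.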
Acknowledgments
This project was partly funded by the Academy of Finland (project 309629). Einar Meister’s work was supported by the European Regional Development Foundation (the project “Centre of Excellence in Estonian Studies”). We thank Fabian Tomaschek for providing a set of R scripts for post-processing of raw EMA data.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Tavi, L., Kinnunen, T., Meister, E., González-Hautamäki, R., Malmi, A. (2021). Articulation During Voice Disguise: A Pilot Study. In: Karpov, A., Potapova, R. (eds) Speech and Computer. SPECOM 2021. Lecture Notes in Computer Science, vol 12997. Springer, Cham. https://doi.org/10.1007/978-3-030-87802-3_61
DOI: https://doi.org/10.1007/978-3-030-87802-3_61
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87801-6
Online ISBN: 978-3-030-87802-3
eBook Packages: Computer Science, Computer Science (R0)