Abstract
In recent years, automatic speaker verification (ASV) has been widely used in speech biometrics. ASV systems are vulnerable to various spoofing attacks, such as synthesized speech (SS), voice conversion (VC), replay attacks, twin attacks, and simulation attacks. Research on securing voice biometric systems for use in security-sensitive fields has therefore attracted growing interest. The trustworthiness of voice-based authentication is particularly important now that biometric systems are widely deployed. We propose a novel secure speech biometric protection method. This article also summarizes previous research on spoofing attacks, focusing on SS, VC, and replay, as well as recent efforts to improve security and develop countermeasures for spoofing speech detection (SSD) tasks. Finally, it points out the limitations and challenges of SSD tasks.
Acknowledgments
This work was supported by the 2020 Education Research Programs for Young Scholars in Fujian Province (JAT200815).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Yang, L., Wu, Z., Guo, J. (2022). A Novel Secure Speech Biometric Protection Method. In: Qiu, M., Gai, K., Qiu, H. (eds) Smart Computing and Communication. SmartCom 2021. Lecture Notes in Computer Science, vol 13202. Springer, Cham. https://doi.org/10.1007/978-3-030-97774-0_41
DOI: https://doi.org/10.1007/978-3-030-97774-0_41
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-97773-3
Online ISBN: 978-3-030-97774-0
eBook Packages: Computer Science (R0)