Cyber Sentinel: Fortifying Voice Assistant Security with Biometric Template Integration in Neural Networks

Cao, Ping; Wang, Yao; Si, Zhipeng; Lyu, Pin; Zhang, Haibin

doi:10.1007/978-3-031-71464-1_16


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14997)

Included in the following conference series:

International Conference on Wireless Artificial Intelligent Computing Systems and Applications

Abstract

With the increasing prevalence of voice assistant services (VAS), ensuring system security and user privacy has become a significant challenge. A preliminary analysis of existing authentication mechanisms reveals shortcomings, particularly in multi-user settings and in their reliance on additional devices. To address these shortcomings, we propose a novel approach that embeds users’ biometric templates into the neural network model of the voice assistant for identity authentication. Leveraging the robust sound processing capabilities of convolutional neural networks (CNNs), the method uses watermarking within the model to verify user identity. Experimental results demonstrate that the method verifies user identities effectively, with negligible impact on the original model’s performance. An evaluation with 10 participants and 300 different voice commands yielded an overall accuracy of 99.01% and an equal error rate (EER) of 1.25%.
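For readers unfamiliar with the EER figure quoted above, the following is a minimal sketch of how an equal error rate is commonly computed from verification scores: it is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The function name `equal_error_rate` and the example scores are illustrative assumptions and are not taken from the chapter; the authors' own evaluation procedure is not described in this abstract.

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Return (eer, threshold) for two sets of similarity scores.

    Higher scores are assumed to mean "more likely the enrolled user".
    This is a simple threshold sweep, not the chapter's evaluation code.
    """
    genuine = np.asarray(genuine, dtype=float)
    impostor = np.asarray(impostor, dtype=float)

    # Every observed score is a candidate decision threshold.
    thresholds = np.unique(np.concatenate([genuine, impostor]))

    best_gap, eer, best_t = np.inf, 1.0, thresholds[0]
    for t in thresholds:
        far = np.mean(impostor >= t)  # false acceptance rate at threshold t
        frr = np.mean(genuine < t)    # false rejection rate at threshold t
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of FAR and FRR where they are closest.
            best_gap, eer, best_t = gap, (far + frr) / 2, t
    return eer, best_t

# Hypothetical scores, made up purely for illustration.
genuine_scores = [0.9, 0.8, 0.7, 0.6]
impostor_scores = [0.1, 0.2, 0.3, 0.65]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(eer, threshold)  # 0.25 0.65
```

With real data the two error curves rarely cross exactly at an observed score, which is why the sketch reports the midpoint of the two rates at the threshold where they are closest.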

Author information

Authors and Affiliations

Xidian University, Xi’an, 710071, Shaanxi, China
Ping Cao, Yao Wang, Zhipeng Si & Haibin Zhang
Northwestern Polytechnical University, Xi’an, 710072, Shaanxi, China
Pin Lyu

Corresponding author

Correspondence to Yao Wang.

Editor information

Editors and Affiliations

Georgia State University, Atlanta, GA, USA
Zhipeng Cai
Old Dominion University, Norfolk, VA, USA
Daniel Takabi
Beijing University of Posts and Telecommunications, Beijing, China
Shaoyong Guo
Shandong University, Qingdao, China
Yifei Zou

About this paper

Cite this paper

Cao, P., Wang, Y., Si, Z., Lyu, P., Zhang, H. (2025). Cyber Sentinel: Fortifying Voice Assistant Security with Biometric Template Integration in Neural Networks. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14997. Springer, Cham. https://doi.org/10.1007/978-3-031-71464-1_16

DOI: https://doi.org/10.1007/978-3-031-71464-1_16
Published: 13 November 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-71463-4
Online ISBN: 978-3-031-71464-1
eBook Packages: Computer Science, Computer Science (R0)
