Cyber Sentinel: Fortifying Voice Assistant Security with Biometric Template Integration in Neural Networks

  • Conference paper
Wireless Artificial Intelligent Computing Systems and Applications (WASA 2024)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14997))

Abstract

With the increasing prevalence of voice assistant services (VAS), ensuring system security and user privacy has become a significant challenge. A preliminary analysis of existing authentication mechanisms reveals shortcomings, particularly in multi-user settings and in their reliance on additional devices. To address these shortcomings, we propose a novel approach that embeds users’ biometric templates into the neural network model of a voice assistant for identity authentication. Leveraging the robust sound processing capabilities of convolutional neural networks (CNNs), the method uses watermarking within the model to verify user identity. Experimental results demonstrate that the method verifies user identities effectively, with negligible impact on the original model’s performance. A further evaluation with 10 participants and 300 different voice commands yielded an overall accuracy of 99.01% and an equal error rate (EER) of 1.25%.
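
For readers unfamiliar with the metric, the EER is the operating point at which the false acceptance rate (impostors accepted) equals the false rejection rate (genuine users rejected). The abstract does not describe how the authors computed it, so the following is only a minimal sketch of the standard definition. The function name `equal_error_rate` and the example scores are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def equal_error_rate(
    genuine: Sequence[float], impostor: Sequence[float]
) -> Tuple[float, float]:
    """Return (eer, threshold) for similarity scores.

    Assumption: a higher score means "more likely the enrolled user",
    and a trial is accepted when score >= threshold.
    """
    if not genuine or not impostor:
        raise ValueError("need at least one genuine and one impostor score")

    best_gap = float("inf")
    eer = 1.0
    best_threshold = 0.0

    # Try every observed score as a candidate decision threshold.
    for threshold in sorted(set(genuine) | set(impostor)):
        # False acceptance rate: impostor trials that would be accepted.
        far = sum(s >= threshold for s in impostor) / len(impostor)
        # False rejection rate: genuine trials that would be rejected.
        frr = sum(s < threshold for s in genuine) / len(genuine)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest;
            # report their mean as the EER estimate.
            best_gap = gap
            eer = (far + frr) / 2
            best_threshold = threshold

    return eer, best_threshold


# Made-up scores for illustration only (not data from the paper).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.1, 0.2, 0.3, 0.6]
eer, threshold = equal_error_rate(genuine_scores, impostor_scores)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

With a finite number of trials the two error rates rarely cross exactly, which is why the sketch reports the mean of the two rates at the closest threshold rather than interpolating.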

Author information

Corresponding author

Correspondence to Yao Wang.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Cao, P., Wang, Y., Si, Z., Lyu, P., Zhang, H. (2025). Cyber Sentinel: Fortifying Voice Assistant Security with Biometric Template Integration in Neural Networks. In: Cai, Z., Takabi, D., Guo, S., Zou, Y. (eds) Wireless Artificial Intelligent Computing Systems and Applications. WASA 2024. Lecture Notes in Computer Science, vol 14997. Springer, Cham. https://doi.org/10.1007/978-3-031-71464-1_16

  • DOI: https://doi.org/10.1007/978-3-031-71464-1_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-71463-4

  • Online ISBN: 978-3-031-71464-1

  • eBook Packages: Computer Science, Computer Science (R0)
