DOI: 10.1145/3463676.3485615
short-paper

What a SHAME: Smart Assistant Voice Command Fingerprinting Utilizing Deep Learning

Published: 15 November 2021

ABSTRACT

It is estimated that by 2024 the number of devices equipped with voice assistant software will exceed 8.4 billion worldwide. While these devices offer convenience to consumers, they suffer from a myriad of security issues. This paper highlights the serious privacy threats posed by information leakage in the encrypted network traffic metadata of a smart assistant. To investigate this issue, we collected a new dataset of dynamic and static commands issued to an Amazon Echo Dot, using data collection and cleaning scripts that we developed.

Furthermore, we propose the Smart Home Assistant Malicious Ensemble model (SHAME) as the new state-of-the-art Voice Command Fingerprinting classifier. When evaluated against several datasets, our attack correctly classifies encrypted voice commands with up to 99.81% accuracy on Google Home traffic and 95.2% accuracy on Amazon Echo Dot traffic. These findings show that security measures must be taken to stop internet service providers, nation-states, and network eavesdroppers from monitoring our intimate conversations.
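
As a rough illustration of the general idea, and not of the SHAME architecture itself (which the abstract does not specify), the sketch below shows how an eavesdropper might reduce an encrypted trace to metadata only (signed packet sizes) and combine several classifiers by averaging their class probabilities. Everything here is an assumption made for illustration: the `Packet` representation, the fixed feature length, the soft-voting rule, and the two toy models standing in for trained networks.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical trace format: (timestamp in seconds, signed size in bytes).
# Positive sizes are outgoing packets, negative sizes are incoming packets.
Packet = Tuple[float, int]
Model = Callable[[List[int]], List[float]]


def to_feature_vector(trace: Sequence[Packet], length: int = 8) -> List[int]:
    """Keep only signed packet sizes, truncated or zero-padded to `length`."""
    sizes = [size for _, size in trace][:length]
    return sizes + [0] * (length - len(sizes))


def soft_vote(prob_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average the per-class probabilities produced by several models."""
    n_models = len(prob_vectors)
    n_classes = len(prob_vectors[0])
    return [sum(p[c] for p in prob_vectors) / n_models for c in range(n_classes)]


def classify(trace: Sequence[Packet], models: Sequence[Model], labels: Sequence[str]) -> str:
    """Return the label with the highest averaged probability."""
    features = to_feature_vector(trace)
    averaged = soft_vote([model(features) for model in models])
    best = max(range(len(labels)), key=lambda c: averaged[c])
    return labels[best]


# Toy stand-ins for trained classifiers (assumptions, not real models).
def size_model(x: List[int]) -> List[float]:
    # Looks at the total number of bytes in the trace.
    return [0.8, 0.2] if sum(abs(v) for v in x) > 2000 else [0.3, 0.7]


def burst_model(x: List[int]) -> List[float]:
    # Looks at how many packets arrived from the server.
    incoming = sum(1 for v in x if v < 0)
    return [0.6, 0.4] if incoming >= 3 else [0.4, 0.6]


labels = ["what is the weather", "set a timer"]
trace = [(0.0, 300), (0.1, -1200), (0.2, -900), (0.3, -400)]
prediction = classify(trace, [size_model, burst_model], labels)
```

The point of the sketch is that no payload is ever decrypted: only packet sizes and directions, which remain visible to an on-path observer, feed the classifiers.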


Supplemental Material

WPES21-fp52s.mp4 (mp4, 31.6 MB)


Published in

WPES '21: Proceedings of the 20th Workshop on Privacy in the Electronic Society
November 2021
257 pages
ISBN: 9781450385275
DOI: 10.1145/3463676
General Chairs: Yongdae Kim, Jong Kim
Program Chairs: Giovanni Livraga, Noseong Park

      Copyright © 2021 ACM

      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher: Association for Computing Machinery, New York, NY, United States


Acceptance Rates

Overall acceptance rate: 106 of 355 submissions, 30%
