What a SHAME: Smart Assistant Voice Command Fingerprinting Utilizing Deep Learning

Authors:
Jack Hyland, Conrad Schneggenburger, Nick Lim, Jake Ruud, Nate Mathews, Matthew Wright

Rochester Institute of Technology, Rochester, NY, USA

WPES '21: Proceedings of the 20th Workshop on Privacy in the Electronic Society, November 2021, Pages 237–243. https://doi.org/10.1145/3463676.3485615

Published: 15 November 2021

ABSTRACT

It is estimated that by 2024 the number of devices equipped with voice assistant software will exceed 8.4 billion globally. While these devices provide convenience to consumers, they suffer from a myriad of security issues. This paper highlights the serious privacy threats exposed by information leakage in the metadata of a smart assistant's encrypted network traffic. To investigate this issue, we collected a new dataset of dynamic and static commands posed to an Amazon Echo Dot, using data collection and cleaning scripts we developed.

Furthermore, we propose the Smart Home Assistant Malicious Ensemble model (SHAME) as the new state-of-the-art Voice Command Fingerprinting classifier. When evaluated against several datasets, our attack correctly classifies encrypted voice commands with up to 99.81% accuracy on Google Home traffic and 95.2% accuracy on Amazon Echo Dot traffic. These findings show that security measures must be taken to stop internet service providers, nation-states, and network eavesdroppers from monitoring our intimate conversations.

Supplemental Material

WPES21-fp52s.mp4

Index Terms

Security and privacy

Published in
WPES '21: Proceedings of the 20th Workshop on Privacy in the Electronic Society
November 2021
257 pages
ISBN: 9781450385275
DOI: 10.1145/3463676
General Chairs:
Yongdae Kim, Korea Advanced Institute of Science & Technology, Republic of Korea
Jong Kim, Pohang University of Science and Technology, Republic of Korea
Program Chairs:
Giovanni Livraga, Università degli Studi di Milano, Italy
Noseong Park, Yonsei University, Republic of Korea
Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
deep learning
smart assistant
traffic fingerprinting
voice command
Qualifiers
- short-paper
Acceptance Rates
Overall Acceptance Rate: 106 of 355 submissions, 30%