WearID: Low-Effort Wearable-Assisted Authentication of Voice Commands via Cross-Domain Comparison without Training

Authors:

Chen Wang

ACSAC '20: Proceedings of the 36th Annual Computer Security Applications Conference

Pages 829 - 842

https://doi.org/10.1145/3427228.3427259

Published: 08 December 2020

Abstract

Due to the open nature of voice input, voice assistant (VA) systems (e.g., Google Home and Amazon Alexa) are prone to leaking security- and privacy-sensitive information (e.g., credit card numbers, passwords), especially when handling critical user commands that involve large purchases, critical calls, etc. Although existing VA systems may use voice features to identify users, they remain vulnerable to various acoustic attacks (e.g., impersonation, replay, and hidden command attacks). In this work, we propose a training-free voice authentication system, WearID, which leverages the cross-domain speech similarity between the audio domain and the vibration domain to provide enhanced security for the ever-growing deployment of VA systems. In particular, when a user gives a critical command, WearID uses the motion sensors on the user’s wearable device to capture the aerial speech in the vibration domain and verifies it against the speech captured in the audio domain by the VA device’s microphone. Compared to existing approaches, our solution is low-effort and privacy-preserving: it neither requires users’ active input (e.g., replying to messages or calls) nor stores users’ privacy-sensitive voice samples for training. In addition, our solution exploits the distinct vibration sensing interface and its short sensing range for sound (e.g., 25 cm) to verify voice commands. Examining the similarity of the two domains’ data is not trivial. The large gap in sampling rates (e.g., 8000 Hz vs. 200 Hz) between the audio and vibration domains makes it hard to compare the two domains’ data directly, and even slight noise in the data can be magnified and cause authentication failures. To address these challenges, we investigate the complex relationship between the two sensing domains and develop a spectrogram-based algorithm that converts the microphone data into lower-frequency “motion sensor data” to facilitate cross-domain comparison. We further develop a user authentication scheme that verifies whether a received voice command originates from the legitimate user, based on the command’s cross-domain speech similarity. We report on extensive experiments that evaluate WearID under various audible and inaudible attacks. The results show that WearID can verify voice commands with 99.8% accuracy in the normal situation and detect 97.2% of fake voice commands from various attacks, including impersonation/replay attacks and hidden voice/ultrasound attacks.
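
The abstract does not spell out the conversion algorithm, so the following is only an illustrative sketch of one way a spectrogram-based cross-domain comparison could work, not the authors' method. It assumes that a 200 Hz motion sensor records speech without anti-aliasing, so a tone at frequency f shows up at its aliased frequency below 100 Hz. Under that assumption, the microphone spectrogram can be folded onto the same aliased bins and compared with the sensor spectrogram by a correlation score. The function names, the 0.1 s window, and the Pearson correlation score are all hypothetical choices made for this example.

```python
import numpy as np

def spectrogram(x, fs, win_sec=0.1):
    """Magnitude spectrogram using non-overlapping Hann windows of win_sec seconds."""
    n = int(round(fs * win_sec))
    frames = len(x) // n
    x = np.asarray(x, dtype=float)[: frames * n].reshape(frames, n)
    mag = np.abs(np.fft.rfft(x * np.hanning(n), axis=1))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, mag  # mag has shape (frames, n // 2 + 1)

def fold_to_low_rate(freqs, mag, fs_low):
    """Assumption: each high-rate bin lands on the frequency it aliases to at fs_low."""
    alias = np.abs(freqs - fs_low * np.round(freqs / fs_low))
    step = freqs[1] - freqs[0]
    idx = np.round(alias / step).astype(int)
    folded = np.zeros((mag.shape[0], int(round(fs_low / 2 / step)) + 1))
    for k, j in enumerate(idx):
        folded[:, j] += mag[:, k] ** 2  # accumulate energy per aliased bin
    return np.sqrt(folded)

def similarity(a, b):
    """Pearson correlation between two equally shaped spectrograms."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def tone(freq_hz, fs, seconds=1.0):
    """Synthetic test signal: a pure tone sampled at fs with no anti-alias filter."""
    t = np.arange(int(fs * seconds)) / fs
    return np.sin(2 * np.pi * freq_hz * t)

FS_AUDIO, FS_VIB = 8000, 200  # sampling rates quoted in the abstract

# Microphone view of a 730 Hz tone, folded into the 0-100 Hz band.
freqs, audio_mag = spectrogram(tone(730, FS_AUDIO), FS_AUDIO)
folded = fold_to_low_rate(freqs, audio_mag, FS_VIB)

# Simulated sensor views: the same tone, and a different (450 Hz) tone.
_, vib_same = spectrogram(tone(730, FS_VIB), FS_VIB)
_, vib_other = spectrogram(tone(450, FS_VIB), FS_VIB)

print("same source:", similarity(folded, vib_same))
print("different source:", similarity(folded, vib_other))
```

In this toy setting the 730 Hz tone aliases to 70 Hz at a 200 Hz sampling rate, so the folded microphone spectrogram and the simulated sensor spectrogram peak in the same bin. A real system would also have to account for the sensor's frequency response and noise, which this sketch ignores.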
