RF-Mic: Live Voice Eavesdropping via Capturing Subtle Facial Speech Dynamics Leveraging RFID

Authors:
Yunzhong Chen, Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China (ORCID: 0000-0002-2389-4188)
Jiadi Yu, Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China (ORCID: 0000-0002-0207-9643)
Linghe Kong, Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China (ORCID: 0000-0001-9266-3044)
Hao Kong, Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China (ORCID: 0000-0002-0871-9795)
Yanmin Zhu, Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China (ORCID: 0000-0001-6406-4992)
Yi-Chao Chen, Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai, China (ORCID: 0000-0003-0782-4953)

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies, Volume 7, Issue 2, Article No. 49, pp. 1–25
https://doi.org/10.1145/3596259

Published: 12 June 2023

Abstract

Eavesdropping on the human voice is one of the most common but harmful threats to personal privacy. Because glasses are in direct contact with the face, they move with the facial motions produced during speech, so speech content can be inferred by sensing the movements of the glasses. In this paper, we present RF-Mic, a live voice eavesdropping method that uses ordinary glasses fitted with a low-cost RFID tag to sense subtle facial speech dynamics and infer the likely speech content. When a user wearing glasses with an RFID tag attached to the bridge speaks, RF-Mic first collects RF signals through forward propagation and backscattering. It then eliminates body motion interference from the collected RF signals using a proposed Conditional Denoising AutoEncoder (CDAE) network. Next, RF-Mic extracts three kinds of facial speech dynamic features (facial movements, bone-borne vibrations, and airborne vibrations) using three different deep-learning models. Based on the extracted features, a facial speech dynamics model is constructed for live voice eavesdropping. Extensive experiments in different real environments demonstrate that RF-Mic can achieve robust and accurate live voice eavesdropping.
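
To make the sensing step concrete, the following is a minimal sketch of how a backscatter phase stream can be turned into relative tag displacement. It is not the RF-Mic pipeline, whose processing is not described in this abstract. It assumes the standard round-trip phase model for backscatter RFID, phase = 4πd/λ + offset (mod 2π). The function name `phase_to_displacement` and the 920.625 MHz carrier frequency are illustrative assumptions.

```python
import numpy as np

SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def phase_to_displacement(phase_rad, carrier_hz=920.625e6):
    """Convert wrapped backscatter phase (radians) to relative displacement (metres).

    Assumes the round-trip model phase = 4*pi*d/lambda + offset (mod 2*pi).
    The default carrier frequency is an assumed UHF RFID channel, not a value
    taken from the paper.
    """
    wavelength = SPEED_OF_LIGHT / carrier_hz
    # Remove 2*pi jumps; valid when consecutive samples differ by less than pi.
    unwrapped = np.unwrap(np.asarray(phase_rad, dtype=float))
    # Subtracting the first sample cancels the unknown constant offset.
    return (unwrapped - unwrapped[0]) * wavelength / (4.0 * np.pi)


def simulate_phase(distance_m, carrier_hz=920.625e6, offset_rad=1.0):
    """Generate wrapped phase readings for a known tag-to-antenna distance trace."""
    wavelength = SPEED_OF_LIGHT / carrier_hz
    return np.mod(4.0 * np.pi * np.asarray(distance_m) / wavelength + offset_rad,
                  2.0 * np.pi)


if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 2000)
    # A hypothetical 0.5 mm vibration at 100 Hz around a 1 m reading distance.
    true_distance = 1.0 + 0.0005 * np.sin(2.0 * np.pi * 100.0 * t)
    recovered = phase_to_displacement(simulate_phase(true_distance))
    print(float(np.max(np.abs(recovered - (true_distance - true_distance[0])))))
```

Only relative motion is recovered, because the constant phase offset contributed by the tag and reader hardware is unknown. Sub-millimetre motion changes the phase by only a few hundredths of a radian at UHF frequencies, which suggests why removing larger body motion interference matters before any speech features can be extracted.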

Index Terms

1. Human-centered computing
  1. Human computer interaction (HCI)
2. Security and privacy
  1. Network security
    1. Mobile and wireless security

Published in

Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies Volume 7, Issue 2
June 2023
969 pages
EISSN:2474-9567
DOI:10.1145/3604631

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
RFID
facial dynamics
glasses
voice eavesdropping
Qualifiers
- research-article
- Research
- Refereed