research-article

Leveraging Disentangled Representations to Improve Vision-Based Keystroke Inference Attacks Under Low Data Constraints

Authors:
John Lim, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Jan-Michael Frahm, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Fabian Monrose, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
CODASPY '22: Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, April 2022, Pages 242–251. https://doi.org/10.1145/3508398.3511498

Published: 15 April 2022

ABSTRACT

Keystroke inference attacks are a form of side-channel attack in which an attacker leverages various techniques to recover a user's keystrokes as she inputs information on some display (e.g., while sending a text message or entering her PIN). Typically, these attacks leverage machine learning approaches, but assessing the realism of the threat space has lagged behind the pace of machine learning advancements, due in part to the challenges in curating large real-life datasets. We aim to overcome the challenge of having a limited amount of real-life data by introducing a video domain adaptation technique that leverages synthetic data through supervised disentangled learning. Specifically, for a given domain, we decompose the observed data into two factors of variation: style and content. Doing so provides four learned representations: real-life style, synthetic style, real-life content, and synthetic content. We then combine them into feature representations covering all combinations of style-content pairings across domains, and train a model on these combined representations to classify the content (i.e., labels) of a given datapoint in the style of another domain. We evaluate our method on real-life data using a variety of metrics to quantify the amount of information an attacker is able to recover. We show that our method prevents our model from overfitting to a small real-life training set, indicating that it is an effective form of data augmentation, thereby making keystroke inference attacks more practical.
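
The pairing step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `build_training_pairs`, the toy vectors, and the use of concatenation to combine a style with a content representation are all assumptions made for illustration. The sketch only shows how two domains yield four style-content pairings, each labeled with the label of its content source.

```python
from itertools import product


def build_training_pairs(styles, contents):
    """Combine every style representation with every content representation.

    styles:   maps a domain name to a style vector (list of floats).
    contents: maps a domain name to a (content vector, label) tuple.
    Concatenation is an assumed combination operator, chosen for illustration.
    """
    examples = []
    for (style_domain, style_vec), (content_domain, (content_vec, label)) in product(
        styles.items(), contents.items()
    ):
        examples.append(
            {
                "style": style_domain,
                "content": content_domain,
                "feature": style_vec + content_vec,  # list concatenation
                "label": label,  # the label always follows the content source
            }
        )
    return examples


# Hypothetical toy representations; a real system would obtain these from learned encoders.
styles = {"real": [0.9, 0.1], "synthetic": [0.2, 0.8]}
contents = {
    "real": ([1.0, 0.0, 0.0], "a"),
    "synthetic": ([0.0, 1.0, 0.0], "b"),
}

examples = build_training_pairs(styles, contents)
for example in examples:
    print(example)
```

A classifier trained on the `feature` and `label` fields of these examples sees each content representation under both styles, which is the sense in which the pairings act as data augmentation.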

Supplemental Material

CODASPY22-fp016.mp4 (mp4, 46.1 MB)

Index Terms

1. Security and privacy
  1. Software and application security
    1. Software security engineering

Published in
CODASPY '22: Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy
April 2022
392 pages
ISBN: 9781450392204
DOI: 10.1145/3508398
General Chair: Anupam Joshi, University of Maryland, Baltimore County, USA
Program Chairs: Maribel Fernandez, King's College London, UK; Rakesh M. Verma, University of Houston, USA
Copyright © 2022 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
data leak detection and prevention
gaze detection
novel datasets
security and privacy
Acceptance Rates
Overall Acceptance Rate: 149 of 789 submissions, 19%