Research article
DOI: 10.1145/3508398.3511498

Leveraging Disentangled Representations to Improve Vision-Based Keystroke Inference Attacks Under Low Data Constraints

Published: 15 April 2022

ABSTRACT

Keystroke inference attacks are a form of side-channel attack in which an attacker leverages various techniques to recover a user's keystrokes as she inputs information into some display (e.g., while sending a text message or entering her PIN). Typically, these attacks leverage machine learning approaches, but assessing the realism of the threat space has lagged behind the pace of machine learning advancements, due in part to the challenges of curating large real-life datasets. We aim to overcome the challenge of limited real data by introducing a video domain adaptation technique that leverages synthetic data through supervised disentangled learning. Specifically, for a given domain, we decompose the observed data into two factors of variation: style and content. Doing so yields four learned representations: real-life style, synthetic style, real-life content, and synthetic content. We then combine them into feature representations for all style-content pairings across domains, and train a model on these combined representations to classify the content (i.e., labels) of a given datapoint in the style of another domain. We evaluate our method on real-life data using a variety of metrics that quantify the amount of information an attacker is able to recover. We show that our method prevents our model from overfitting to a small real-life training set, indicating that it is an effective form of data augmentation, thereby making keystroke inference attacks more practical.
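The style-content pairing step described above can be sketched as follows. This is a toy illustration only: the `encode` function is an assumed stand-in for the paper's learned style and content encoders, not the authors' architecture, and the features are random placeholders.

```python
# Toy sketch of the cross-domain style-content pairing idea:
# each domain's sample is encoded into a "style" vector and a
# "content" vector, and all four pairings feed one classifier.
import numpy as np

rng = np.random.default_rng(0)

def encode(x):
    """Assumed stand-in for the learned encoders: split a feature
    vector into a (style, content) pair of halves."""
    half = len(x) // 2
    return x[:half], x[half:]

real = rng.normal(size=8)    # features of a real-life datapoint
synth = rng.normal(size=8)   # features of a synthetic datapoint

s_real, c_real = encode(real)
s_synth, c_synth = encode(synth)

# All four style-content combinations across the two domains.
pairings = [
    np.concatenate([s_real, c_real]),    # real style, real content
    np.concatenate([s_real, c_synth]),   # real style, synthetic content
    np.concatenate([s_synth, c_real]),   # synthetic style, real content
    np.concatenate([s_synth, c_synth]),  # synthetic style, synthetic content
]
batch = np.stack(pairings)  # (4, 8); a classifier would be trained on these
print(batch.shape)
```

In this scheme, content labels carry over to the mixed pairings (e.g., synthetic style with real content keeps the real datapoint's label), which is what lets the classifier see each label rendered in both styles.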


Supplemental Material

CODASPY22-fp016.mp4 (mp4, 46.1 MB)


Published in

CODASPY '22: Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy
April 2022, 392 pages
ISBN: 9781450392204
DOI: 10.1145/3508398
Copyright © 2022 ACM


Publisher: Association for Computing Machinery, New York, NY, United States



Acceptance Rate

Overall acceptance rate: 149 of 789 submissions, 19%
