research-article

Vessels: efficient and scalable deep learning prediction on trusted processors

Authors:

Chung Hwan Kim,

Junghwan "John" Rhee,

Dave (Jing) Tian,

Byoungyoung LeeAuthors Info & Claims

SoCC '20: Proceedings of the 11th ACM Symposium on Cloud Computing

Pages 462 - 476

https://doi.org/10.1145/3419111.3421282

Published: 12 October 2020 Publication History

Abstract

Deep learning systems on the cloud are increasingly targeted by attacks that attempt to steal sensitive data. Intel SGX has been proven effective to protect the confidentiality and integrity of such data during computation. However, state-of-the-art SGX systems still suffer from substantial performance overhead induced by the limited physical memory of SGX. This limitation significantly undermines the usability of deep learning systems due to their memory-intensive characteristics.

In this paper, we provide a systematic study on the inefficiency of the existing SGX systems for deep learning prediction with a focus on their memory usage. Our study has revealed two causes of the inefficiency in the current memory usage paradigm: large memory allocation and low memory reusability. Based on this insight, we present Vessels, a new system that addresses the inefficiency and overcomes the limitation on SGX memory through memory usage optimization techniques. Vessels identifies the memory allocation and usage patterns of a deep learning program through model analysis and creates a trusted execution environment with an optimized memory pool, which minimizes the memory footprint with high memory reusability. Our experiments demonstrate that, by significantly reducing the memory foot-print and carefully scheduling the workloads, Vessels can achieve highly efficient and scalable deep learning prediction while providing strong data confidentiality and integrity with SGX.

Supplementary Material

MP4 File (p462-kim-presentation.mp4)

Download
183.40 MB

References

[1]

2017. Deep Learning Model Converters. https://github.com/ysh329/deep-learning-model-convertor.

[2]

2017. TF Trusted. https://github.com/dropoutlabs/tf-trusted.

[3]

2017. Top 5 Cloud Security related Data Breaches! https://www.cybersecurity-insiders.com/top-5--cloud-security-related-data-breaches/.

[4]

2018. Asylo: An open and flexible framework for enclave applications. https://asylo.dev/.

[5]

2019. Deep Learning on AWS. https://aws.amazon.com/deep-learning/.

[6]

2019. Deep Learning VM | Google Cloud. https://cloud.google.com/deep-learning-vm/.

[7]

2019. Human Error Often the Culprit in Cloud Data Breaches. https://www.wsj.com/articles/human-error-often-the-culprit-in-cloud-data-breaches-11566898203.

[8]

2019. Machine Learning Service | Microsoft Azure. https://azure.microsoft.com/en-us/services/machine-learning-service/.

[9]

2019. TensorFlow Lite. https://www.tensorflow.org/lite.

[10]

2020. AWS SageMaker Neo. https://aws.amazon.com/sagemaker/neo/.

[11]

2020. IBM Cloud Data Shield. https://www.ibm.com/cloud/data-shield.

[12]

2020. MS Azure Confidential Computing. https://azure.microsoft.com/en-us/solutions/confidential-compute/.

[13]

Martin Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. 2016. Tensorflow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16). 265--283.

[14]

Sergei Arnautov, Bohdan Trach, Franz Gregor, Thomas Knauth, Andre Martin, Christian Priebe, Joshua Lind, Divya Muthukumaran, Dan O'Keeffe, Mark L. Stillwell, David Goltzsche, Dave Eyers, Rüdiger Kapitza, Peter Pietzuch, and Christof Fetzer. 2016. SCONE: Secure Linux Containers with Intel SGX. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI '16). Savannah, GA, 689--703.

Digital Library

[15]

Somnath Chakrabarti, Matthew Hoekstra, Dmitrii Kuvaiskii, and Mona Vij. 2019. Scaling Intel Software Guard Extensions Applications with Intel SGX Card. In Proceedings of the 8th International Workshop on Hardware and Architectural Support for Security and Privacy (HASP '19).

Digital Library

[16]

Chia che Tsai, Donald E. Porter, and Mona Vij. 2017. Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX. In 2017 USENIX Annual Technical Conference (USENIX ATC '17). Santa Clara, CA, 645--658.

[17]

Guoxing Chen, Sanchuan Chen, Yuan Xiao, Yinqian Zhang, Zhiqiang Lin, and Ten H Lai. 2019. Sgxpectre: Stealing intel secrets from sgx enclaves via speculative execution. In 2019 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 142--157.

[18]

Victor Costan and Srinivas Devadas. 2016. Intel SGX Explained. IACR Cryptology ePrint Archive 2016, 086 (2016), 1--118.

[19]

Tu Dinh Ngoc, Bao Bui, Stella Bitchebe, Alain Tchana, Valerio Schiavoni, Pascal Felber, and Daniel Hagimont. 2019. Everything You Should Know About Intel SGX Performance on Virtualized Systems. In Abstracts of the 2019 SIGMETRICS/Performance Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '19). 77--78.

[20]

Jonathan Frankle and Michael Carbin. 2018. The lottery ticket hypothesis: Finding sparse, trainable neural networks. arXiv preprint arXiv:1803.03635 (2018).

[21]

Matt Fredrikson, Somesh Jha, and Thomas Ristenpart. 2015. Model inversion attacks that exploit confidence information and basic countermeasures. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security. ACM, 1322--1333.

Digital Library

[22]

Matthew Fredrikson, Eric Lantz, Somesh Jha, Simon Lin, David Page, and Thomas Ristenpart. 2014. Privacy in pharmacogenetics: An end-to-end case study of personalized warfarin dosing. In 23rd USENIX Security Symposium (USENIX Security '14). 17--32.

[23]

Zhongshu Gu, Heqing Huang, Jialong Zhang, Dong Su, Hani Jamjoom, Ankita Lamba, Dimitrios Pendarakis, and Ian Molloy. 2018. YerbaBuena: Securing Deep Learning Inference Data via Enclave-based Ternary Model Partitioning. arXiv preprint arXiv:1807.00969 (2018).

[24]

Song Han, Huizi Mao, and William J Dally. 2015. Deep compression: Compressing deep neural networks with pruning, trained qare two popular schemesouantization and huffman coding. arXiv preprint arXiv:1510.00149 (2015).

[25]

Song Han, Jeff Pool, John Tran, and William Dally. 2015. Learning both weights and connections for efficient neural network. In Advances in neural information processing systems. 1135--1143.

[26]

Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition. 770--778.

[27]

Sanghyun Hong, Pietro Frigo, Yigitcan Kaya, Cristiano Giuffrida, and Tudor Dumitras. 2019. Terminal Brain Damage: Exposing the Graceless Degradation in Deep Neural Networks Under Hardware Fault Attacks. In 28th USENIX Security Symposium (USENIX Security 19). Santa Clara, CA, 497--514.

[28]

Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. 2017. Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 4700--4708.

[29]

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. 2017. Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. arXiv preprint arXiv:1712.05877v1 (2017).

[30]

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, and Dmitry Kalenichenko. 2018. Quantization and training of neural networks for efficient integer-arithmetic-only inference. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2704--2713.

[31]

Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, and Trevor Darrell. 2014. Caffe: Convolutional Architecture for Fast Feature Embedding. arXiv preprint arXiv:1408.5093 (2014).

[32]

Yiping Kang, Johann Hauswald, Cao Gao, Austin Rovinski, Trevor Mudge, Jason Mars, and Lingjia Tang. 2017. Neurosurgeon: Collaborative Intelligence Between the Cloud and Mobile Edge. In Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '17).

Digital Library

[33]

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. 2012. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems. 1097--1105.

[34]

Roland Kunkel, Do Le Quoc, Franz Gregor, Sergei Arnautov, Pramod Bhatotia, and Christof Fetzer. 2019. TensorSCONE: A Secure Tensor-Flow Framework using Intel SGX. arXiv preprint arXiv:1902.04413 (2019).

[35]

Joshua Lind, Christian Priebe, Divya Muthukumaran, Dan O'Keeffe, Pierre-Louis Aublin, Florian Kelbert, Tobias Reiher, David Goltzsche, David Eyers, Rüdiger Kapitza, et al. 2017. Glamdring: Automatic Application Partitioning for Intel SGX. In 2017 USENIX Annual Technical Conference (USENIX ATC '17). 285--298.

[36]

Yingqi Liu, Shiqing Ma, Yousra Aafer, Wen-Chuan Lee, Juan Zhai, Weihang Wang, and Xiangyu Zhang. 2018. Trojaning attack on neural networks. In Proceedings of the 25th Network and Distributed System Security Symposium (NDSS 2018).

[37]

Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser, Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, and Kim Hazelwood. 2005. Pin: Building Customized Program Analysis Tools with Dynamic Instrumentation. In Proceedings of the 2005 ACM SIGPLAN Conference on Programming Language Design and Implementation (Chicago, IL, USA) (PLDI '05). 11.

Digital Library

[38]

Frank McKeen, Ilya Alexandrovich, Ittai Anati, Dror Caspi, Simon Johnson, Rebekah Leslie-Hurd, and Carlos Rozas. 2016. Intel Software Guard Extensions (Intel SGX) Support for Dynamic Memory Management Inside an Enclave. In Proceedings of the Hardware and Architectural Support for Security and Privacy 2016 (HASP '16).

Digital Library

[39]

Dirk Merkel. 2014. Docker: lightweight linux containers for consistent development and deployment. Linux journal 2014, 239 (2014), 2.

Digital Library

[40]

Kit Murdock, David Oswald, Flavio D Garcia, Jo Van Bulck, Daniel Gruss, and Frank Piessens. 2020. Plundervolt: Software-based Fault Injection Attacks against Intel SGX. In 2020 IEEE Symposium on Security and Privacy (SP '20).

[41]

Olga Ohrimenko, Felix Schuster, Cédric Fournet, Aastha Mehta, Sebastian Nowozin, Kapil Vaswani, and Manuel Costa. 2016. Oblivious multi-party machine learning on trusted processors. In 25th USENIX Security Symposium (USENIX Security '16). 619--636.

[42]

Meni Orenbach, Andrew Baumann, and Mark Silberstein. 2020. Autarky: closing controlled channels with self-paging enclaves. In Proceedings of the Fifteenth European Conference on Computer Systems. 1--16.

Digital Library

[43]

Meni Orenbach, Pavel Lifshits, Marina Minkin, and Mark Silberstein. 2017. Eleos: ExitLess OS services for SGX enclaves. In Proceedings of the Twelfth European Conference on Computer Systems. ACM, 238--253.

Digital Library

[44]

Sandro Pinto and Nuno Santos. 2019. Demystifying Arm TrustZone: A Comprehensive Survey. ACM Computing Surveys (CSUR) 51, 6 (2019), 130.

Digital Library

[45]

Minghai Qin, Chao Sun, and Dejan Vucinic. 2017. Robustness of Neural Networks against Storage Media Errors. (09 2017).

[46]

Joseph Redmon. 2013--2016. Darknet: Open Source Neural Networks in C. http://pjreddie.com/darknet/.

[47]

Joseph Redmon and Ali Farhadi. 2018. Yolov3: An incremental improvement. arXiv preprint arXiv:1804.02767 (2018).

[48]

Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, Alexander C. Berg, and Li Fei-Fei. 2015. ImageNet Large Scale Visual Recognition Challenge. International Journal of Computer Vision (IJCV) 115, 3 (2015), 211--252.

Digital Library

[49]

Fahad Shaon, Murat Kantarcioglu, Zhiqiang Lin, and Latifur Khan. 2017. SGX-BigMatrix: A practical encrypted data analytic framework with trusted processors. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security. ACM, 1211--1228.

Digital Library

[50]

Mahmood Sharif, Sruti Bhagavatula, Lujo Bauer, and Michael K Reiter. 2016. Accessorize to a crime: Real and stealthy attacks on state-of-the-art face recognition. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. ACM, 1528--1540.

Digital Library

[51]

Karen Simonyan and Andrew Zisserman. 2014. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556 (2014).

[52]

Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. 2016. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition. 2818--2826.

[53]

Meysam Taassori, Ali Shafiee, and Rajeev Balasubramonian. 2018. VAULT: Reducing paging overheads in SGX with efficient integrity verification structures. In ACM SIGPLAN Notices, Vol. 53. ACM, 665--678.

Digital Library

[54]

Shruti Tople, Karan Grover, Shweta Shinde, Ranjita Bhagwan, and Ramachandran Ramjee. 2018. Privado: Practical and secure DNN inference. arXiv preprint arXiv:1810.00602 (2018).

[55]

Florian Tramer and Dan Boneh. 2018. Slalom: Fast, verifiable and private execution of neural networks in trusted hardware. arXiv preprint arXiv:1806.03287 (2018).

[56]

Peter M VanNostrand, Ioannis Kyriazis, Michelle Cheng, Tian Guo, and Robert J Walls. 2019. Confidential Deep Learning: Executing Proprietary Models on Untrusted Devices. arXiv preprint arXiv:1908.10730 (2019).

[57]

Simon Wiedemann, Klaus-Robert Müller, and Wojciech Samek. 2019. Compact and computationally efficient representation of deep neural networks. IEEE transactions on neural networks and learning systems (2019).

[58]

Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. 2017. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE conference on computer vision and pattern recognition. 1492--1500.

Cited By

Li QShen ZQin ZXie YZhang XDu TCheng SWang XYin JCai JKankanhalli MPrabhakaran BBoll SSubramanian RZheng LSingh VCesar PXie LXu D(2024)TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge DeploymentProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3680786(3479-3488)Online publication date: 28-Oct-2024
https://dl.acm.org/doi/10.1145/3664647.3680786
Zhang ZGong CCai YYuan YLiu BLi DGuo YChen X(2024)No Privacy Left Outside: On the (In-)Security of TEE-Shielded DNN Partition for On-Device ML2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00052(3327-3345)Online publication date: 19-May-2024
https://doi.org/10.1109/SP54263.2024.00052
Li YZeng DGut LGuo SZomaya A(2024)DNN Partitioning and Assignment for Distributed Inference in SGX Empowered Edge Cloud2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS60910.2024.00065(635-644)Online publication date: 23-Jul-2024
https://doi.org/10.1109/ICDCS60910.2024.00065
Show More Cited By

Recommendations

A Novel Memory Block Management Scheme for PCM Using WOM-Code
HPCC-CSS-ICESS '15: Proceedings of the 2015 IEEE 17th International Conference on High Performance Computing and Communications, 2015 IEEE 7th International Symposium on Cyberspace Safety and Security, and 2015 IEEE 12th International Conf on Embedded Software and Systems

Phase Change Memory (PCM) is a promising DRAM replacement in embedded systems due to its attractive characteristics including low static power consumption and high density. However, long write latency is one of the major drawbacks in current PCM ...
WOM-Code Solutions for Low Latency and High Endurance in Phase Change Memory
This paper describes a write-once-memory-code phase change memory (WOM-code PCM) architecture for next-generation non-volatile memory applications. Specifically, we address the long latency of the write operation in PCM—attributed to PCM SET—...
DEUCE: Write-Efficient Encryption for Non-Volatile Memories
ASPLOS'15

Phase Change Memory (PCM) is an emerging Non Volatile Memory (NVM) technology that has the potential to provide scalable high-density memory systems. While the non-volatility of PCM is a desirable property in order to save leakage power, it also has the ...

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences

SoCC '20: Proceedings of the 11th ACM Symposium on Cloud Computing

October 2020

535 pages

ISBN:9781450381376

DOI:10.1145/3419111

General Chair:
Rodrigo Fonseca
Microsoft and Brown University
,
Program Chairs:
Christina Delimitrou
Cornell University
,
Beng Chin Ooi
National University of Singapore

Copyright © 2020 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 12 October 2020

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Qualifiers

Research-article

Conference

SoCC '20

Sponsor:

SoCC '20: ACM Symposium on Cloud Computing

October 19 - 21, 2020

Virtual Event, USA

Acceptance Rates

SoCC '20 Paper Acceptance Rate 35 of 143 submissions, 24%;

Overall Acceptance Rate 169 of 722 submissions, 23%

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

26
Total Citations
View Citations
541
Total Downloads

Downloads (Last 12 months)47
Downloads (Last 6 weeks)3

Reflects downloads up to 14 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Li QShen ZQin ZXie YZhang XDu TCheng SWang XYin JCai JKankanhalli MPrabhakaran BBoll SSubramanian RZheng LSingh VCesar PXie LXu D(2024)TransLinkGuard: Safeguarding Transformer Models Against Model Stealing in Edge DeploymentProceedings of the 32nd ACM International Conference on Multimedia10.1145/3664647.3680786(3479-3488)Online publication date: 28-Oct-2024
https://dl.acm.org/doi/10.1145/3664647.3680786
Zhang ZGong CCai YYuan YLiu BLi DGuo YChen X(2024)No Privacy Left Outside: On the (In-)Security of TEE-Shielded DNN Partition for On-Device ML2024 IEEE Symposium on Security and Privacy (SP)10.1109/SP54263.2024.00052(3327-3345)Online publication date: 19-May-2024
https://doi.org/10.1109/SP54263.2024.00052
Li YZeng DGut LGuo SZomaya A(2024)DNN Partitioning and Assignment for Distributed Inference in SGX Empowered Edge Cloud2024 IEEE 44th International Conference on Distributed Computing Systems (ICDCS)10.1109/ICDCS60910.2024.00065(635-644)Online publication date: 23-Jul-2024
https://doi.org/10.1109/ICDCS60910.2024.00065
Yang MYi WWang JHu HXu XLi Z(2024)Penetralium: Privacy-preserving and memory-efficient neural network inference at the edgeFuture Generation Computer Systems10.1016/j.future.2024.03.008156(30-41)Online publication date: Jul-2024
https://doi.org/10.1016/j.future.2024.03.008
Shu JShu J(2024)Storage SecurityData Storage Architectures and Technologies10.1007/978-981-97-3534-1_10(271-309)Online publication date: 28-Aug-2024
https://doi.org/10.1007/978-981-97-3534-1_10
Feng HPang TDu CChen WYan SLin M(2024)BAFFLE: A Baseline of Backpropagation-Free Federated LearningComputer Vision – ECCV 202410.1007/978-3-031-73226-3_6(89-109)Online publication date: 1-Nov-2024
https://doi.org/10.1007/978-3-031-73226-3_6
Gao HYue CDinh THuang ZOoi B(2023)Enabling Secure and Efficient Data Analytics Pipeline Evolution with Trusted Execution EnvironmentProceedings of the VLDB Endowment10.14778/3603581.360358916:10(2485-2498)Online publication date: 8-Aug-2023
https://dl.acm.org/doi/10.14778/3603581.3603589
Banerjee SWei SRamrakhyani PTiwari M(2023)Triton: Software-Defined Threat Model for Secure Multi-Tenant ML Inference AcceleratorsProceedings of the 12th International Workshop on Hardware and Architectural Support for Security and Privacy10.1145/3623652.3623672(19-28)Online publication date: 29-Oct-2023
https://dl.acm.org/doi/10.1145/3623652.3623672
Will NMaziero C(2023)Intel Software Guard Extensions Applications: A SurveyACM Computing Surveys10.1145/359302155:14s(1-38)Online publication date: 17-Jul-2023
https://dl.acm.org/doi/10.1145/3593021
Hu BWang YCheng JZhao TXie YGuo XChen Y(2023)Secure and Efficient Mobile DNN Using Trusted Execution EnvironmentsProceedings of the 2023 ACM Asia Conference on Computer and Communications Security10.1145/3579856.3582820(274-285)Online publication date: 10-Jul-2023
https://dl.acm.org/doi/10.1145/3579856.3582820
Show More Cited By

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Publication

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Figures

Tables

Media

View Table of Conten