When Serverless Computing Meets Different Degrees of Customization for DNN Inference

Authors:
Moohyun Song, Computer Science, Kookmin Univ., Seoul, South Korea (ORCID: 0009-0003-9392-7808)
Yoonseo Hur, Computer Science, Kookmin Univ., Seoul, South Korea (ORCID: 0009-0001-3521-8021)
Kyungyong Lee, Computer Science, Kookmin Univ., Seoul, South Korea (ORCID: 0000-0003-0312-4386)

WoSC '23: Proceedings of the 9th International Workshop on Serverless Computing, December 2023, Pages 42–47. https://doi.org/10.1145/3631295.3631400

Published: 11 December 2023


ABSTRACT

Serverless computing provides a way to develop application services without the burden of managing the run-time execution environment. Since the initial offerings of serverless computing using function-as-a-service (FaaS), other variants of execution environments have been proposed, such as a special-purpose FaaS (SPF) for deep neural network (DNN) inference and a serverless container service (SCS) for general web applications. This paper qualitatively summarizes the characteristics of a general-purpose FaaS (GPF), SPF, and SCS from the perspective of customizability when setting up execution environments. To judge whether these serverless computing environments are feasible solutions for an interactive DNN model inference application, we conduct extensive experiments. We conclude that there is room for performance improvement in serverless DNN inference, and that allowing a custom environment setup can make a serverless computing platform a feasible option for an interactive DNN application.


Index Terms

Computer systems organization → Architectures → Distributed architectures → Cloud computing

Published in

WoSC '23: Proceedings of the 9th International Workshop on Serverless Computing
December 2023
68 pages
ISBN: 9798400704550
DOI: 10.1145/3631295

Copyright © 2023 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
dnn inference
serverless computing
Qualifiers
- research-article
- Research
- Refereed limited
