research-article

YuanRong: A Production General-purpose Serverless System for Distributed Applications in the Cloud

Published: 04 August 2024

Abstract

We design, implement, and evaluate YuanRong, the first production general-purpose serverless platform with a unified programming interface, multi-language runtime, and a distributed computing kernel for cloud-based applications. YuanRong addresses many limitations of existing Function-as-a-Service (FaaS) systems, particularly their limited performance and lack of important features. First, our fast function system supports sub-millisecond function invocation and locality-aware hierarchical scheduling. Second, our multi-semantic built-in data system achieves object exchange latency of 200 microseconds, enabling end-to-end latency of 2 milliseconds for streaming elements at 5 Gbps throughput. Third, the extensible, portable Service Bridge bridges stateless and stateful operations, allowing connection reuse and distributed transactions, and offers a unified backend abstraction for multi-cloud portability. YuanRong has been deployed for over 3 years at Huawei across nearly 20 datacenter regions, processing up to 30 billion requests per day on more than 100,000 CPU cores, with a daily average CPU usage of 53%. It serves various serverless workloads, including enhanced FaaS services, microservices, data analytics, deep model training and serving, and some HPC workloads. Our experience shows that Spring-based microservices can be migrated to YuanRong within one day, reducing resource costs by 90%, which demonstrates its generality and efficiency in supporting a broad spectrum of applications.

Cited By

  • (2024) SURE: Secure Unikernels Make Serverless Computing Rapid and Efficient. Proceedings of the 2024 ACM Symposium on Cloud Computing, 668-688. DOI: 10.1145/3698038.3698558. Online publication date: 20-Nov-2024
  • (2024) Streamlining Cloud-Native Application Development and Deployment with Robust Encapsulation. Proceedings of the 2024 ACM Symposium on Cloud Computing, 847-865. DOI: 10.1145/3698038.3698552. Online publication date: 20-Nov-2024
  • (2024) ComboFunc: Joint Resource Combination and Container Placement for Serverless Function Scaling With Heterogeneous Container. IEEE Transactions on Parallel and Distributed Systems 35:11, 1989-2005. DOI: 10.1109/TPDS.2024.3454071. Online publication date: 3-Sep-2024


    Published In

    ACM SIGCOMM '24: Proceedings of the ACM SIGCOMM 2024 Conference
    August 2024
    1033 pages
    ISBN: 9798400706141
    DOI: 10.1145/3651890
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. FaaS
    2. serverless
    3. cloud computing
    4. distributed systems

    Qualifiers

    • Research-article

    Conference

    ACM SIGCOMM '24: ACM SIGCOMM 2024 Conference
    August 4 - 8, 2024
    Sydney, NSW, Australia

    Acceptance Rates

    Overall Acceptance Rate 462 of 3,389 submissions, 14%

    Article Metrics

    • Downloads (Last 12 months): 1,355
    • Downloads (Last 6 weeks): 159
    Reflects downloads up to 16 Feb 2025
