DOI: 10.1145/3590140.3629115 (Middleware Conference Proceedings)
research-article

Kernel-as-a-Service: A Serverless Programming Model for Heterogeneous Hardware Accelerators

Published: 27 November 2023

Abstract

With the slowing of Moore's law and decline of Dennard scaling, computing systems increasingly rely on specialized hardware accelerators in addition to general-purpose compute units. Increased hardware heterogeneity necessitates disaggregating applications into workflows of fine-grained tasks that run on a diverse set of CPUs and accelerators. Current accelerator delivery models cannot support such applications efficiently, as (1) the overhead of managing accelerators erases performance benefits for fine-grained tasks; (2) exclusive accelerator use per task leads to underutilization; and (3) specialization increases complexity for developers.
We propose adopting concepts from Function-as-a-Service (FaaS), which has solved these challenges for general-purpose CPUs in cloud computing. Kernel-as-a-Service (KaaS) is a novel serverless programming model for generic compute accelerators that aids heterogeneous workflows by combining the ease-of-use of higher-level abstractions with the performance of low-level hand-tuned code. We evaluate KaaS with a focus on the breadth of the idea and its generality to diverse architectures rather than on an in-depth implementation for a single accelerator. Using proof-of-concept prototypes, we show that this programming model provides performance, performance efficiency, and ease-of-use benefits across a diverse range of compute accelerators. Despite increased levels of abstraction, when compared to a naive accelerator implementation, KaaS reduces completion times for fine-grained tasks by up to 96.0% (GPU), 68.4% (FPGA), 98.6% (TPU), and 34.9% (QPU) in our experiments.
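To put the reported reductions in perspective, a reduction in completion time can be converted into an equivalent speedup factor. The short sketch below does this arithmetic for the four "up to" figures quoted in the abstract. The function and variable names are illustrative and not taken from the paper, and because the figures are best-case values, the resulting factors are upper bounds for the reported experiments rather than typical speedups.

```python
# Upper-bound reductions in completion time (percent), as quoted in the abstract.
reductions_percent = {"GPU": 96.0, "FPGA": 68.4, "TPU": 98.6, "QPU": 34.9}


def speedup_from_reduction(reduction_percent: float) -> float:
    """Convert a relative reduction in completion time into a speedup factor.

    If t_kaas = (1 - r) * t_naive, then speedup = t_naive / t_kaas = 1 / (1 - r).
    """
    if not 0.0 <= reduction_percent < 100.0:
        raise ValueError("reduction must be in the range [0, 100)")
    return 1.0 / (1.0 - reduction_percent / 100.0)


# Hypothetical helper table: accelerator name -> equivalent speedup factor.
speedups = {name: speedup_from_reduction(r) for name, r in reductions_percent.items()}

for name, factor in speedups.items():
    print(f"{name}: {reductions_percent[name]}% shorter completion time "
          f"is roughly a {factor:.1f}x speedup")
```

Under this conversion, the 96.0% reduction on the GPU corresponds to about a 25x speedup, while the 34.9% reduction on the QPU corresponds to about 1.5x.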

References

[1]
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2015. TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems. Google Research. Retrieved May 10, 2023 from https://www.tensorflow.org/
[2]
Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng. 2022. tf.nn.conv2d / TensorFlow v2.11.0. Google Research. Retrieved December 1, 2022 from https://www.tensorflow.org/api_docs/python/tf/nn/conv2d
[3]
Giovanni Agosta, William Fornaciari, Giuseppe Massari, Anna Pupykina, Federico Reghenzani, and Michele Zanella. 2018. Managing Heterogeneous Resources in HPC Systems. In Proceedings of the 9th Workshop and 7th Workshop on Parallel Programming and RunTime Management Techniques for Manycore Architectures and Design Tools and Architectures for Multicore Embedded Computing Platforms (Manchester, United Kingdom) (PARAM-DITAM '18). Association for Computing Machinery (ACM), New York, NY, USA, 7--12. https://doi.org/10.1145/3183767.3183769
[4]
AMD Xilinx. 2022. Pynq: Python Productivity for Zynq. Retrieved December 2, 2022 from http://pynq.io
[5]
Krste Asanović. 2014. FireBox: A Hardware Building Block for 2020 Warehouse-Scale Computers. In Proceedings of the 12th USENIX Conference on File and Storage Technologies (Santa Clara, CA, USA) (FAST '14). USENIX, Berkeley, CA, USA.
[6]
Jose Antonio Ayala-Barbosa and Paul Erick Mendez-Monroy. 2022. A new preemptive task scheduling framework for heterogeneous embedded systems. In Proceedings of the 2022 8th International Conference on Computer Technology Applications (Vienna, Austria) (ICCTA '22). Association for Computing Machinery (ACM), New York, NY, USA, 77--84. https://doi.org/10.1145/3543712.3543756
[7]
Ioana Baldini, Perry Cheng, Stephen J. Fink, Nick Mitchell, Vinod Muthusamy, Rodric Rabbah, Philippe Suter, and Olivier Tardieu. 2017. The Serverless Trilemma: Function Composition for Serverless Computing. In Proceedings of the 2017 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software (Vancouver, BC, Canada) (Onward! 2017). Association for Computing Machinery (ACM), New York, NY, USA, 89--103. https://doi.org/10.1145/3133850.3133855
[8]
Kirk M. Bresniker, Paolo Faraboschi, Avi Mendelson, Dejan Milojicic, Timothy Roscoe, and Robert N. M. Watson. 2019. Rack-Scale Capabilities: Fine-Grained Protection for Large-Scale Memories. Computer 52, 2 (Feb. 2019), 52--62. https://doi.org/10.1109/MC.2018.2888769
[9]
Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José MM Montiel, and Juan D. Tardós. 2021. ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial, and Multimap SLAM. IEEE Transactions on Robotics 37, 6 (May 2021), 1874--1890. https://doi.org/10.1109/TRO.2021.3075644
[10]
Yudong Cao, Jonathan Romero, Jonathan P. Olson, Matthias Degroote, Peter D. Johnson, Mária Kieferová, Ian D. Kivlichan, Tim Menke, Borja Peropadre, Nicolas P. D. Sawaya, Sukin Sim, Libor Veis, and Alán Aspuru-Guzik. 2019. Quantum Chemistry in the Age of Quantum Computing. Chemical Reviews 119, 19 (Aug. 2019), 10856--10915. https://doi.org/10.1021/acs.chemrev.8b00803
[11]
Adrian Caulfield, Paolo Costa, and Monia Ghobadi. 2018. Beyond SmartNICs: Towards a Fully Programmable Cloud. In Proceedings of the 19th International Conference on High Performance Switching and Routing (Bucharest, Romania) (HPSR '18). IEEE, New York, NY, USA, 1--6. https://doi.org/10.1109/HPSR.2018.8850757
[12]
Ryan Chard, Yadu Babuji, Zhuozhao Li, Tyler Skluzacek, Anna Woodard, Ben Blaiszik, Ian Foster, and Kyle Chard. 2020. funcX: A Federated Function Serving Fabric for Science. In Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing (Virtual Event, USA) (HPDC '20). Association for Computing Machinery (ACM), New York, NY, USA, 65--76. https://doi.org/10.1145/3369583.3392683
[13]
Marcin Copik, Marcin Chrapek, Alexandru Calotoiu, and Torsten Hoefler. 2022. Software Resource Disaggregation for HPC with Serverless Computing. Technical Report. Scalable Parallel Computing Lab, ETH Zürich, Zurich, Switzerland.
[14]
Marcin Copik, Konstantin Taranov, Alexandru Calotoiu, and Torsten Hoefler. 2023. rFaaS: Enabling High Performance Serverless with RDMA and Leases. In Proceedings of the 37th IEEE International Parallel & Distributed Processing Symposium (St. Petersburg, FL, USA) (IPDPDS '23). IEEE, New York, NY, USA.
[15]
Marco Cuturi and Mathieu Blondel. 2017. Soft-DTW: A Differentiable Loss Function for Time-Series. In Proceedings of the 34th International Conference on Machine Learning (Sydney, NSW, Australia) (ICML '17). Journal of Machine Learning Research, 894--903.
[16]
Bradley Denby and Brandon Lucia. 2020. Orbital Edge Computing: Nanosatellite Constellations as a New Class of Computer System. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (Lausanne, Switzerland) (ASPLOS '20). Association for Computing Machinery (ACM), New York, NY, USA, 939--954. https://doi.org/10.1145/3373376.3378473
[17]
Aditya Dhakal, Sameer G. Kulkarni, and K. K. Ramakrishnan. 2020. GSLICE: Controlled Spatial Sharing of GPUs for a Scalable Inference Platform. In Proceedings of the 11th ACM Symposium on Cloud Computing (Virtual Event, USA) (SoCC '20). Association for Computing Machinery (ACM), New York, NY, USA, 492--506. https://doi.org/10.1145/3419111.3421284
[18]
Aditya Dhakal, Xukan Ran, Yunshu Wang, Jiasi Chen, and K. K. Ramakrishnan. 2022. SLAM-Share: Visual Simultaneous Localization and Mapping for Real-Time Multi-User Augmented Reality. In Proceedings of the 18th International Conference on Emerging Networking EXperiments and Technologies (Rome, Italy) (CoNEXT '22). Association for Computing Machinery (ACM), New York, NY, USA, 293--306. https://doi.org/10.1145/3555050.3569142
[19]
Dong Du, Qingyuan Liu, Xueqiang Jiang, Yubin Xia, Binyu Zang, and Haibo Chen. 2022. Serverless Computing on Heterogeneous Computers. In Proceedings of the 27th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (Lausanne, Switzerland) (ASPLOS '22). Association for Computing Machinery (ACM), New York, NY, USA, 797--813. https://doi.org/10.1145/3503222.3507732
[20]
Nicolas Dube, Duncan Roweth, Paolo Faraboschi, and Dejan Milojicic. 2021. Future of HPC: The Internet of Workflows. IEEE Internet Computing 25, 5 (Aug. 2021), 26--34. https://doi.org/10.1109/MIC.2021.3103236
[21]
Jorge Ejarque, Rosa M. Badia, Loïc Albertin, Giovanni Aloisio, Enrico Baglione, Yolanda Becerra, Stefan Boschert, Julian R. Berlin, Alessandro D'Anca, Donatello Elia, et al. 2022. Enabling dynamic and intelligent workflows for HPC, data analytics, and AI convergence. Future generation computer systems 134 (Sept. 2022), 414--429. https://doi.org/10.1016/j.future.2022.04.014
[22]
Donatello Elia, Sandro Fiore, and Giovanni Aloisio. 2021. Towards HPC and Big Data Analytics Convergence: Design and Experimental Evaluation of a HPDA Framework for eScience at Scale. IEEE Access 9 (May 2021), 73307--73326. https://doi.org/10.1109/ACCESS.2021.3079139
[23]
Kayvon Fatahalian, Jeremy Sugerman, and Pat Hanrahan. 2004. Understanding the Efficiency of GPU Algorithms for Matrix-Matrix Multiplication. In Proceedings of the ACM SIGGRAPH/EUROGRAPHICS conference on Graphics hardware (Grenoble, France) (HWWS '04). Association for Computing Machinery (ACM), New York, NY, USA, 133--137. https://doi.org/10.1145/1058129.1058148
[24]
Marcel Flottmann, Marc Eisoldt, Julian Gaal, Marc Rothmann, Marco Tassemeier, Thomas Wiemann, and Mario Porrmann. 2021. Energy-efficient FPGA-accelerated LiDAR-based SLAM for embedded robotics. In Proceedings of the 2021 International Conference on Field-Programmable Technology (Auckland, New Zealand) (ICFPT '21). IEEE, New York, NY, USA, 1--6. https://doi.org/10.1109/ICFPT52863.2021.9609934
[25]
Eitan Frachtenberg. 2021. Experience and Practice Teaching an Undergraduate Course on Diverse Heterogeneous Architectures. In Proceedings of the 2021 IEEE/ACM Ninth Workshop on Education for High Performance Computing (St. Louis, MO, USA) (EduHPC '21). IEEE, New York, NY, USA, 1--8. https://doi.org/10.1109/EduHPC54835.2021.00006
[26]
Trevor Gale, Matei Zaharia, Cliff Young, and Erich Elsen. 2020. Sparse GPU Kernels for Deep Learning. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (Atlanta, GA, USA) (SC '20). IEEE, New York, NY, USA, 1--14. https://doi.org/10.1109/SC41405.2020.00021
[27]
Rajesh Gandham, Yongpeng Zhang, Kenneth Esler, and Vincent Natoli. 2021. Improving GPU throughput of reservoir simulations using NVIDIA MPS and MIG. In Proceedings of the Fifth EAGE Workshop on High Performance Computing for Upstream (Online). European Association of Geoscientists & Engineers, Houten, The Netherlands, 1--5. https://doi.org/10.3997/2214-4609.2021612025
[28]
Andreas Gerstmayr, Ken McDonell, Lukas Berk, Mark Goodwin, Marko Myllynen, and Nathan Scott. 2022. Performance Co-Pilot. Red Hat, Inc. Retrieved October 1, 2022 from https://pcp.io/
[29]
Nicholas Gordon, Kevin Pedretti, and John R. Lange. 2022. Porting the Kitten Lightweight Kernel Operating System to RISC-V. In Proceedings of the International Workshop on Runtime and Operating Systems for Supercomputers (Dallas, TX, USA) (ROSS '22). IEEE, New York, NY, USA, 1--7. https://doi.org/10.1109/ROSS56639.2022.00008
[30]
Jashwant Raj Gunasekaran, Prashanth Thinakaran, Nachiappan Chidambaram, Mahmut T. Kandemir, and Chita R. Das. 2020. Fifer: Tackling Underutilization in the Serverless Era. (Aug. 2020). arXiv:2008.12819
[31]
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. 2016. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (Las Vegas, NV, USA) (CVPR 2016). IEEE, New York, NY, USA, 770--778. https://doi.org/10.1109/CVPR.2016.90
[32]
Scott Hendrickson, Stephen Sturdevant, Tyler Harter, Venkateshwaran Venkataramani, Andrea C. Arpaci-Dusseau, and Remzi H. Arpaci-Dusseau. 2016. Serverless Computation with OpenLambda. In Proceedings of the 8th USENIX Workshop on Hot Topics in Cloud Computing (Denver, CO, USA) (HotCloud '16). USENIX Association, Berkeley, CA, USA.
[33]
Hewlett Packard Enterprise. 2020. Enabling GPU as a Service -- A Cloud-Like Experience for GPU Infrastructure using Containers (Solution Brief). Retrieved September 11, 2023 from https://www.hpe.com/psnow/doc/a00075067enw
[34]
Anahita Hosseinkhani and Behnam Ghavami. 2021. Improving Soft Error Reliability of FPGA-based Deep Neural Networks with Reduced Approximate TMR. In Proceedings of the 2021 11th International Conference on Computer Engineering and Knowledge (Mashhad, Iran) (ICCKE '21). IEEE, New York, NY, USA, 459--464. https://doi.org/10.1109/ICCKE54056.2021.9721442
[35]
Sitao Huang, Kun Wu, Hyunmin Jeong, Chengyue Wang, Deming Chen, and Wen-Mei Hwu. 2021. Pylog: An algorithm-centric python-based FPGA programming and synthesis flow. IEEE Trans. Comput. 70, 12 (Oct. 2021), 2015--2028. https://doi.org/10.1109/TC.2021.3123465
[36]
IBM Quantum. 2021. IBM Quantum Processor Types. Retrieved May 24, 2023 from https://quantum-computing.ibm.com/services/resources/docs/resources/manage/systems/processors
[37]
IBM Quantum. 2022. Qiskit. Retrieved December 2, 2022 from https://qiskit.org/
[38]
Al Amjad Tawfiq Isstaif and Richard Mortier. 2023. Towards Latency-Aware Linux Scheduling for Serverless Workloads. In Proceedings of the 1st Workshop on SErverless Systems, Applications and MEthodologies (Rome, Italy) (SESAME '23). Association for Computing Machinery (ACM), New York, NY, USA, 19--26. https://doi.org/10.1145/3592533.3592807
[39]
Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, and Fan Yang. 2019. Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads. In Proceedings of the 2019 USENIX Annual Technical Conference (Renton, WA, USA) (ATC '19). USENIX Association, Berkeley, CA, USA, 947--960.
[40]
Fauzi Mohd Johar, Farah Ayuni Azmin, Mohamad Kadim Suaidi, Abdul Samad Shibghatullah, Badrul Hisham Ahmad, Siti Nadzirah Salleh, Mohamad Zoinol Abidin Abd Aziz, and Mahfuzah Md Shukor. 2013. A review of genetic algorithms and parallel genetic algorithms on graphics processing unit (GPU). In Proceedings of the 2013 International Conference on Control System, Computing and Engineering (Penang, Malaysia) (ICCSCE '13). IEEE, New York, NY, USA, 264--269. https://doi.org/10.1109/ICCSCE.2013.6719971
[41]
Eric Jonas, Johann Schleier-Smith, Vikram Sreekanti, Chia-Che Tsai, Anurag Khandelwal, Qifan Pu, Vaishaal Shankar, Joao Carreira, Karl Krauth, Neeraja Yadwadkar, et al. 2019. Cloud Programming Simplified: A Berkeley View on Serverless Computing. Technical Report UCB/EECS-2019-3. EECS Department, University of California, Berkeley, Berkeley, CA, USA. https://www2.eecs.berkeley.edu/Pubs/TechRpts/2019/EECS-2019-3.html
[42]
Norman P. Jouppi, Doe Hyun Yoon, George Kurian, Sheng Li, Nishant Patil, James Laudon, Cliff Young, and David Patterson. 2020. A Domain-Specific Supercomputer for Training Deep Neural Networks. Commun. ACM 63, 7 (June 2020), 67--78. https://doi.org/10.1145/3360307
[43]
Hamidreza Khaleghzadeh, Ziming Zhong, Ravi Reddy, and Alexey Lastovetsky. 2017. Out-of-core implementation for accelerator kernels on heterogeneous clouds. The Journal of Supercomputing 74, 2 (Sept. 2017), 551--568. https://doi.org/10.1007/s11227-017-2141-4
[44]
Dario Korolija, Timothy Roscoe, and Gustavo Alonso. 2020. Do OS abstractions make sense on FPGAs?. In Proceedings of the 14th USENIX Symposium on Operating Systems Design and Implementation (Online) (OSDI '20). USENIX Association, Berkeley, CA, USA, 991--1010.
[45]
Jörn Kuhlenkamp, Sebastian Werner, Maria C. Borges, Dominik Ernst, and Daniel Wenzel. 2020. Benchmarking Elasticity of FaaS Platforms as a Foundation for Objective-driven Design of Serverless Applications. In Proceedings of the 35th Annual ACM Symposium on Applied Computing (Brno, Czech Republic) (SAC '20). Association for Computing Machinery (ACM), New York, NY, USA, 1576--1585. https://doi.org/10.1145/3341105.3373948
[46]
Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. 2015. Numba: A LLVM-Based Python JIT Compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC (Austin, TX, USA) (LLVM '15). Association for Computing Machinery (ACM), New York, NY, USA, 1--6. https://doi.org/10.1145/2833157.2833162
[47]
Baolin Li, Tirthak Patel, Siddarth Samsi, Vijay Gadepally, and Devesh Tiwari. 2022. Using Multi-Instance GPU for Efficient Operation of Multi-Tenant GPU Clusters. (July 2022). arXiv:2207.11428
[48]
Junfeng Li, Sameer G. Kulkarni, K. K. Ramakrishnan, and Dan Li. 2019. Understanding Open Source Serverless Platforms: Design Considerations and Performance. In Proceedings of the 5th International Workshop on Serverless Computing (Davis, CA, USA) (WoSC '19). Association for Computing Machinery (ACM), New York, NY, USA, 37--42. https://doi.org/10.1145/3366623.3368139
[49]
Teng Li, Vikram K. Narayana, Esam El-Araby, and Tarek El-Ghazawi. 2011. GPU Resource Sharing and Virtualization on High Performance Computing Systems. In Proceedings of the 2011 International Conference on Parallel Processing (Taipei, Taiwan) (ICPP '11). IEEE, New York, NY, USA, 733--742. https://doi.org/10.1109/ICPP.2011.88
[50]
Fabio Maschi, Dario Korolija, and Gustavo Alonso. 2023. Serverless FPGA: Work-In-Progress. In Proceedings of the 1st Workshop on SErverless Systems, Applications and MEthodologies (Rome, Italy) (SESAME '23). Association for Computing Machinery (ACM), New York, NY, USA, 1--4. https://doi.org/10.1145/3592533.3592804
[51]
Anil Mathew, Vasilios Andrikopoulos, and Frank J. Blaauw. 2021. Exploring the cost and performance benefits of AWS Step Functions using a data processing pipeline. In Proceedings of the 14th IEEE/ACM International Conference on Utility and Cloud Computing (Leicester, United Kingdom) (UCC '21). Association for Computing Machinery (ACM), New York, NY, USA, 1--10. https://doi.org/10.1145/3468737.3494084
[52]
Dejan Milojicic, Paolo Faraboschi, Nicolas Dube, and Duncan Roweth. 2021. Future of HPC: Diversifying Heterogeneity. In Proceedings of the 2021 Design, Automation & Test in Europe Conference & Exhibition (Grenoble, France) (DATE '21). IEEE, New York, NY, USA, 276--281. https://doi.org/10.23919/DATE51398.2021.9474063
[53]
Diana M. Naranjo, Sebastián Risco, Carlos de Alfonso, Alfonso Pérez, Ignacio Blanquer, and Germán Moltó. 2020. Accelerated serverless computing based on GPU virtualization. J. Parallel and Distrib. Comput. 139 (May 2020), 32--42. https://doi.org/10.1016/j.jpdc.2020.01.004
[54]
Anna Maria Nestorov, Josep Lluís Berral, Claudia Misale, Chen Wang, David Carrera, and Alaa Youssef. 2022. Floki: A Proactive Data Forwarding System for Direct Inter-Function Communication for Serverless Workflows. In Proceedings of the Eighth International Workshop on Container Technologies and Container Clouds (Quebec City, QC, Canada) (WoC '22). Association for Computing Machinery (ACM), New York, NY, USA, 13--18. https://doi.org/10.1145/3565384.3565890
[55]
Sam Newman. 2015. Building Microservices. O'Reilly Media, Inc., Sebastopol, CA, USA.
[56]
Kim Nguyen and Sam Chung. 2021. Low Maintenance, Low Cost, Highly Secure, and Highly Manageable Serverless Solutions for Software Reverse Engineering. In Proceedings of the Conference on Information Systems Applied Research (Washington, DC, USA) (CONISAR '21). Information Systems and Computing Academic Professionals, 1--10.
[57]
Kyndylan Nienhuis, Alexandre Joannou, Thomas Bauereiss, Anthony Fox, Michael Roe, Brian Campbell, Matthew Naylor, Robert M. Norton, Simon W. Moore, Peter G. Neumann, Ian Stark, Robert N. M. Watson, and Peter Sewell. 2020. Rigorous engineering for hardware security: Formal modelling and proof in the CHERI design and implementation process. In Proceedings of the 2020 IEEE Symposium on Security and Privacy (San Francisco, CA, USA) (SP '20). IEEE, New York, NY, USA, 1003--1020. https://doi.org/10.1109/SP40000.2020.00055
[58]
NVIDIA. 2023. Multi-Process Service. Retrieved May 25, 2023 from https://docs.nvidia.com/deploy/mps/index.html
[59]
NVIDIA. 2023. NVIDIA Multi-Instance GPU. Retrieved May 25, 2023 from https://www.nvidia.com/en-us/technologies/multi-instance-gpu/
[60]
Jacob Pan. 2013. RAPL (Running Average Power Limit) driver. Intel Corporation. Retrieved December 2, 2022 from https://lwn.net/Articles/545745/
[61]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An Imperative Style, High-Performance Deep Learning Library. In Advances in Neural Information Processing Systems 32. Neural Information Processing Systems Foundation, 8024--8035.
[62]
Nathan Pemberton. 2022. The Serverless Datacenter: Hardware and Software Techniques for Resource Disaggregation. Ph.D. Dissertation. University of California, Berkeley, Berkeley, CA, USA. Advisor(s) Randy Katz.
[63]
Nathan Pemberton and Johann Schleier-Smith. 2019. The Serverless Data Center: Hardware Disaggregation Meets Serverless Computing. In Proceedings of the First Workshop on Resource Disaggregation (Providence, RI, USA) (WORD '19).
[64]
Nathan Pemberton, Anton Zabreyko, Zhoujie Ding, Randy Katz, and Joseph Gonzalez. 2022. Kernel-as-a-Service: A Serverless Interface to GPUs. (Dec. 2022). arXiv:2212.08146
[65]
Alberto Peruzzo, Jarrod McClean, Peter Shadbolt, Man-Hong Yung, Xiao-Qi Zhou, Peter J. Love, Alán Aspuru-Guzik, and Jeremy L. O'brien. 2014. A variational eigenvalue solver on a photonic quantum processor. Nature communications 5, 1, Article 4213 (July 2014), 7 pages. https://doi.org/10.1038/ncomms5213
[66]
Murad Qasaimeh, Kristof Denolf, Jack Lo, Kees Vissers, Joseph Zambreno, and Phillip H. Jones. 2019. Comparing energy efficiency of CPU, GPU and FPGA implementations for vision kernels. In Proceedings of the International 2019 IEEE International Conference on Embedded Software and Systems (Las Vegas, NV, USA) (ICESS '19). IEEE, New York, NY, USA, 1--8. https://doi.org/10.1109/ICESS.2019.8782524
[67]
Shixiong Qi, Leslie Monis, Ziteng Zeng, Ian-chin Wang, and K. K. Ramakrishnan. 2022. SPRIGHT: Extracting the Server from Serverless Computing! High-Performance EBPF-Based Event-Driven, Shared-Memory Processing. In Proceedings of the ACM SIGCOMM 2022 Conference (Amsterdam, Netherlands) (SIGCOMM '22). Association for Computing Machinery (ACM), New York, NY, USA, 780--794. https://doi.org/10.1145/3544216.3544259
[68]
Issam Raïs, Anne-Cécile Orgerie, and Martin Quinson. 2016. Impact of Shutdown Techniques for Energy-Efficient Cloud Data Centers. In Proceedings of the International Conference on Algorithms and Architectures for Parallel Processing (Granada, Spain) (ICA3PP '16). Springer, Heidelberg, Germany, 203--210. https://doi.org/10.1007/978-3-319-49583-5_15
[69]
Gourav Rattihalli, Ninad Hogade, Aditya Dhakal, Eitan Frachtenberg, Rolando Pablo Hong Enriquez, Pedro Bruel, Alok Mishra, and Dejan Milojicic. 2023. Fine-Grained Heterogeneous Execution Framework with Energy Aware Scheduling. In Proceedings of the 2023 IEEE 16th International Conference on Cloud Computing (Chicago, IL, USA) (CLOUD '23). IEEE, New York, NY, USA, 35--44. https://doi.org/10.1109/CLOUD60044.2023.00014
[70]
Sebastián Risco and Germán Moltó. 2021. GPU-Enabled Serverless Workflows for Efficient Multimedia Processing. Journal of Applied Sciences 11, 4 (Feb. 2021), 1438. https://doi.org/10.3390/app11041438
[71]
Felix Ritter, Tobias Boskamp, A. Homeyer, Hendrik Laue, Michael Schwier, Florian Link, and H.-O. Peitgen. 2011. Medical Image Analysis. IEEE Pulse 2, 6 (Dec. 2011), 60--70. https://doi.org/10.1109/MPUL.2011.942929
[72]
Andrea Sabbioni, Lorenzo Rosa, Armir Bujari, Luca Foschini, and Antonio Corradi. 2021. A Shared Memory Approach for Function Chaining in Serverless Platforms. In Proceedings of the 2021 IEEE Symposium on Computers and Communications (Athens, Greece) (ISCC '21). IEEE, New York, NY, USA, 1--6. https://doi.org/10.1109/ISCC53001.2021.9631385
[73]
Marc Sánchez-Artigas and Germán T. Eizaguirre. 2022. A Seer Knows Best: Optimized Object Storage Shuffling for Serverless Analytics. In Proceedings of the 23rd ACM/IFIP International Middleware Conference (Quebec City, QC, Canada) (Middleware '22). Association for Computing Machinery (ACM), New York, NY, USA, 148--160. https://doi.org/10.1145/3528535.3565241
[74]
Trever Schirmer, Joel Scheuner, Tobias Pfandzelter, and David Bermbach. 2022. Fusionize: Improving Serverless Application Performance through Feedback-Driven Function Fusion. In Proceedings of the 10th IEEE International Conference on Cloud Engineering (Asilomar, CA, USA) (IC2E 2022). IEEE, New York, NY, USA, 85--95. https://doi.org/10.1109/IC2E55432.2022.00017
[75]
Hossein Shafiei, Ahmad Khonsari, and Payam Mousavi. 2022. Serverless Computing: A Survey of Opportunities, Challenges, and Applications. Comput. Surveys 54, 11s (Jan. 2022), 1--32. https://doi.org/10.1145/3510611
[76]
Mohammad Shahrad, Rodrigo Fonseca, Íñigo Goiri, Gohar Chaudhry, Paul Batum, Jason Cooke, Eduardo Laureano, Colby Tresness, Mark Russinovich, and Ricardo Bianchini. 2020. Serverless in the Wild: Characterizing and Optimizing the Serverless Workload at a Large Cloud Provider. In Proceedings of the 2020 USENIX Annual Technical Conference (Virtual Event, USA) (ATC '20). USENIX Association, Berkeley, CA, USA, 205--218.
[77]
John Shalf. 2020. The future of computing beyond Moore's Law. Philosophical Transactions of the Royal Society A 378, 2166 (Jan. 2020), 20190061. https://doi.org/10.1098/rsta.2019.0061
[78]
Prateek Sharma. 2022. Challenges and Opportunities in Sustainable Serverless Computing. In Proceedings of the 1st Workshop on Sustainable Computer Systems Design and Implementation (La Jolla, CA, USA) (HotCarbon '22). USENIX Association, Berkeley, CA, USA.
[79]
Sushant Sharma, Chung-Hsing Hsu, and Wu-chun Feng. 2006. Making a Case for a Green500 List. In Proceedings of the Proceedings 20th IEEE International Parallel & Distributed Processing Symposium (Rhodes, Greece) (IPDPS '06). IEEE, New York, NY, USA. https://doi.org/10.1109/IPDPS.2006.1639600
[80]
Prasoon Sinha, Akhil Guliani, Rutwik Jain, Brandon Tran, Matthew D. Sinclair, and Shivaram Venkataraman. 2022. Not All GPUs Are Created Equal: Characterizing Variability in Large-Scale, Accelerator-Rich Systems. In Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis (Dallas, TX, USA) (SC '22). IEEE, New York, NY, USA, 1--15. https://doi.org/10.1109/SC41404.2022.00070
[81]
Sebastian Thrun. 2007. Simultaneous localization and mapping. In Robotics and cognitive approaches to spatial mapping. Springer, 13--41.
[82]
Paramita Basak Upama, Md Jobair Hossain Faruk, Mohammad Nazim, Mohammad Masum, Hossain Shahriar, Gias Uddin, Shabir Barzanjeh, Sheikh Iqbal Ahamed, and Akond Rahman. 2022. Evolution of Quantum Computing: A Systematic Survey on the Use of Quantum Computing Tools. In Proceedings of the 46th Annual Computers, Software, and Applications Conference (Virtual Event, USA) (COMPSAC '22). IEEE, New York, NY, USA, 520--529. https://doi.org/10.1109/COMPSAC54236.2022.00096
[83]
Ava Vali, Sara Comai, and Matteo Matteucci. 2020. Deep Learning for Land Use and Land Cover Classification Based on Hyperspectral and Multispectral Earth Observation Data: A Review. Remote Sensing 12, 15 (Aug. 2020), 2495. https://doi.org/10.3390/rs12152495
[84]
Blesson Varghese and Rajkumar Buyya. 2018. Next generation cloud computing: New trends and research directions. Future Generation Computer Systems 79 (Feb. 2018), 849--861. https://doi.org/10.1016/j.future.2017.09.020
[85]
Ao Wang, Shuai Chang, Huangshi Tian, Hongqi Wang, Haoran Yang, Huiba Li, Rui Du, and Yue Cheng. 2021. FaaSNet: Scalable and Fast Provisioning of Custom Serverless Container Runtimes at Alibaba Cloud Function Compute. In Proceedings of the 2021 USENIX Annual Technical Conference (Virtual Event, USA) (ATC '21). USENIX Association, Berkeley, CA, USA, 443--457.
[86]
Minjie Wang, Da Zheng, Zihao Ye, Quan Gan, Mufei Li, Xiang Song, Jinjing Zhou, Chao Ma, Lingfan Yu, Yu Gai, Tianjun Xiao, Tong He, George Karypis, Jinyang Li, and Zheng Zhang. 2019. Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks. (Sept. 2019). arXiv:1909.01315
[87]
Zhenning Wang, Jun Yang, Rami Melhem, Bruce Childers, Youtao Zhang, and Minyi Guo. 2015. Simultaneous Multikernel: Fine-Grained Sharing of GPUs. IEEE Computer Architecture Letters 15, 2 (Sept. 2015), 113--116. https://doi.org/10.1109/LCA.2015.2477405
[88]
Logan Ward, Ganesh Sivaraman, J. Gregory Pauloski, Yadu Babuji, Ryan Chard, Naveen Dandu, Paul C. Redfern, Rajeev S. Assary, Kyle Chard, Larry A. Curtiss, Rajeev Thakur, and Ian Foster. 2021. Colmena: Scalable Machine-Learning-Based Steering of Ensemble Simulations for High Performance Computing. In Proceedings of the 2021 IEEE/ACM Workshop on Machine Learning in High Performance Computing Environments (St. Louis, MO, USA) (MLHPC '21). IEEE, New York, NY, USA, 9--20. https://doi.org/10.1109/MLHPC54614.2021.00007
[89]
Stefan Weinzierl. 2000. Introduction to Monte Carlo methods. (June 2000). arXiv:hep-ph/0006269
[90]
Sebastian Werner and Trever Schirmer. 2022. Hardless: A Generalized Serverless Compute Architecture for Hardware Processing Accelerators. In Proceedings of the 10th IEEE International Conference on Cloud Engineering (Asilomar, CA, USA) (IC2E 2022). IEEE, New York, NY, USA, 79--84. https://doi.org/10.1109/IC2E55432.2022.00016
[91]
Robert Wille, Rod Van Meter, and Yehuda Naveh. 2019. IBM's Qiskit tool chain: Working with and developing for real quantum computers. In Proceedings of the 2019 Design, Automation & Test in Europe Conference & Exhibition (Florence, Italy) (DATE '19). IEEE, New York, NY, USA, 1234--1240. https://doi.org/10.23919/DATE.2019.8715261
[92]
Bo Wu, Xu Liu, Xiaobo Zhou, and Changjun Jiang. 2017. FLEP: Enabling Flexible and Efficient Preemption on GPUs. ACM SIGPLAN Notices 52, 4 (April 2017), 483--496. https://doi.org/10.1145/3093336.3037742
[93]
Tsung Tai Yeh, Amit Sabne, Putt Sakdhnagool, Rudolf Eigenmann, and Timothy G. Rogers. 2017. Pagoda: Fine-Grained GPU Resource Virtualization for Narrow Tasks. ACM SIGPLAN Notices 52, 8 (Aug. 2017), 221--234. https://doi.org/10.1145/3155284.3018754
[94]
Mohamed Zahran. 2016. Heterogeneous Computing: Here to Stay: Hardware and Software Perspectives. Queue 14, 6 (Nov. 2016), 31--42. https://doi.org/10.1145/3028687.3038873
[95]
Yue Zha and Jing Li. 2020. Virtualizing FPGAs in the Cloud. In Proceedings of the Twenty-Fifth International Conference on Architectural Support for Programming Languages and Operating Systems (Lausanne, Switzerland) (ASPLOS '20). Association for Computing Machinery (ACM), New York, NY, USA, 845--858. https://doi.org/10.1145/3373376.3378491
[96]
Peng Zhang, Jianbin Fang, Canqun Yang, Chun Huang, Tao Tang, and Zheng Wang. 2020. Optimizing Streaming Parallelism on Heterogeneous Many-Core Architectures. IEEE Transactions on Parallel and Distributed Systems 31, 8 (March 2020), 1878--1896. https://doi.org/10.1109/TPDS.2020.2978045
[97]
Wei Zhang, Quan Chen, Ningxin Zheng, Weihao Cui, Kaihua Fu, and Minyi Guo. 2021. Toward QoS-Awareness and Improved Utilization of Spatial Multitasking GPUs. IEEE Transactions on Computers 71, 4 (March 2021), 866--879. https://doi.org/10.1109/TC.2021.3064352
[98]
Chen Zhao, Wu Gao, Feiping Nie, and Huiyang Zhou. 2021. A Survey of GPU Multitasking Methods Supported by Hardware Architecture. IEEE Transactions on Parallel and Distributed Systems 33, 6 (Sept. 2021), 1451--1463. https://doi.org/10.1109/TPDS.2021.3115630
[99]
Haidong Zhao, Zakaria Benomar, Tobias Pfandzelter, and Nikolaos Georgantas. 2022. Supporting Multi-Cloud in Serverless Computing. In Proceedings of the 15th IEEE/ACM International Conference on Utility and Cloud Computing Companion (Vancouver, WA, USA) (UCC '22). IEEE, New York, NY, USA, 285--290. https://doi.org/10.1109/UCC56403.2022.00051
[100]
Laiping Zhao, Yanan Yang, Yiming Li, Xian Zhou, and Keqiu Li. 2021. Understanding, Predicting and Scheduling Serverless Workloads under Partial Interference. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (St. Louis, MO, USA) (SC '21). Association for Computing Machinery (ACM), New York, NY, USA, 1--15. https://doi.org/10.1145/3458817.3476215

Cited By

  • (2024) Enabling HPC Scientific Workflows for Serverless. Proceedings of the SC '24 Workshops of the International Conference on High Performance Computing, Network, Storage, and Analysis, 110--125. https://doi.org/10.1109/SCW63240.2024.00022. Online publication date: 17-Nov-2024
  • (2024) Predicting Heterogeneity and Serverless Principles of Converged High-Performance Computing, Artificial Intelligence, and Workflows. Computer 57, 1, 136--144. https://doi.org/10.1109/MC.2023.3332973. Online publication date: 3-Jan-2024

Published In

Middleware '23: Proceedings of the 24th International Middleware Conference
November 2023
334 pages
ISBN: 9798400701771
DOI: 10.1145/3590140
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

In-Cooperation

  • IFIP: International Federation for Information Processing

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 November 2023
Accepted: 13 October 2023
Revised: 02 June 2023
Received: 02 December 2022

Author Tags

  1. Accelerators
  2. Heterogeneity
  3. Serverless

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Middleware '23

Acceptance Rates

Overall Acceptance Rate 203 of 948 submissions, 21%

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 329
  • Downloads (Last 6 weeks): 14
Reflects downloads up to 18 Feb 2025
