Exploiting HLS-Generated Multi-Version Kernels to Improve CPU-FPGA Cloud Systems

Authors: Bernardo Neuhaus Lignati, Michael Guilherme Jordan, Guilherme Korol, Mateus Beck Rutzig, Antonio Carlos Schneider Beck

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference

Pages 536 - 541

https://doi.org/10.1145/3394885.3431557

Published: 29 January 2021

Abstract

Cloud warehouses have been exploiting CPU-FPGA collaborative execution environments, where multiple clients share the same infrastructure to maximize resource utilization with the highest possible energy efficiency and scalability. However, resource provisioning is challenging in these environments, since kernels may be dispatched to both CPU and FPGA concurrently in a scenario that varies widely in available resources and workload characteristics. In this work, we propose MultiVers, a framework that leverages automatic HLS generation to enable further gains in such CPU-FPGA collaborative systems. MultiVers exploits automatic HLS generation to build libraries containing multiple versions of each incoming kernel request, greatly enlarging the design space available for optimization by the cloud provider's allocation strategies. MultiVers makes kernel multiversioning and the allocation strategy work symbiotically, allowing fine-tuning in terms of resource usage, performance, energy, or any combination of these parameters. We show the efficiency of MultiVers using real-world cloud request scenarios with a diversity of benchmarks, achieving average improvements in makespan and energy of up to 4.62x and 19.04x, respectively, over traditional allocation strategies executing non-optimized kernels.
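The idea of choosing among several versions of each kernel can be illustrated with a minimal sketch. This is not the MultiVers implementation: the `Version` record, the `pick_versions` function, the single LUT budget, the weighted cost, and all numbers are hypothetical, and exhaustive search stands in for whatever allocation strategy the paper actually uses. It only shows how a library with multiple versions per kernel widens the set of feasible allocations.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Version:
    """One hypothetical HLS-generated version of a kernel."""
    name: str
    luts: int         # FPGA resource usage (illustrative unit)
    time_ms: float    # estimated execution time
    energy_mj: float  # estimated energy


def pick_versions(library, lut_budget, w_time=1.0, w_energy=0.0):
    """Pick one version per kernel so total LUTs fit the budget and the
    weighted time/energy cost is minimal (brute force over all combos)."""
    best, best_cost = None, float("inf")
    for combo in product(*library.values()):
        if sum(v.luts for v in combo) > lut_budget:
            continue  # this combination does not fit on the FPGA
        cost = sum(w_time * v.time_ms + w_energy * v.energy_mj for v in combo)
        if cost < best_cost:
            best, best_cost = dict(zip(library, combo)), cost
    return best, best_cost


# Made-up library: two versions for each of two kernels.
library = {
    "gemm": [Version("gemm_small", 10, 8.0, 2.0),
             Version("gemm_unrolled", 40, 2.0, 5.0)],
    "md5": [Version("md5_small", 5, 6.0, 2.0),
            Version("md5_pipelined", 30, 1.5, 2.5)],
}

if __name__ == "__main__":
    choice, cost = pick_versions(library, lut_budget=50)
    print({k: v.name for k, v in choice.items()}, cost)
```

Changing `w_time` and `w_energy` shifts the objective between performance and energy, which mirrors the fine-tuning the abstract describes. The brute-force loop grows exponentially with the number of kernels, so it is only suitable for toy inputs.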




Published In

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference

January 2021

930 pages

ISBN:9781450379991

DOI:10.1145/3394885

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGDA: ACM Special Interest Group on Design Automation
IEEE CAS
IEEE CEDA

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ASPDAC '21

Sponsor:

SIGDA

ASPDAC '21: 26th Asia and South Pacific Design Automation Conference

January 18 - 21, 2021

Tokyo, Japan

Acceptance Rates

ASPDAC '21 Paper Acceptance Rate 111 of 368 submissions, 30%;

Overall Acceptance Rate 466 of 1,454 submissions, 32%

Article Metrics

Total Citations: 4
Total Downloads: 94
Downloads (Last 12 months): 8
Downloads (Last 6 weeks): 2

Reflects downloads up to 19 Feb 2025

Cited By

Korol G, Jordan M, Rutzig M, Beck A. (2022). AdaFlow: A Framework for Adaptive Dataflow CNN Acceleration on FPGAs. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), 244-249. Online publication date: 14-Mar-2022. https://doi.org/10.23919/DATE54114.2022.9774727

Knorst T, Korol G, Jordan M, Vicenzi J, Lorenzon A, Rutzig M, Beck A. (2022). On the benefits of Collaborative Thread Throttling and HLS-Versioning in CPU-FPGA Environments. 2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI), 1-6. Online publication date: 22-Aug-2022. https://doi.org/10.1109/SBCCI55532.2022.9893223

Knorst T, Jordan M, Lorenzon A, Rutzig M, Beck A. (2021). ETCF – Energy-Aware CPU Thread Throttling and Workload Balancing Framework for CPU-FPGA Collaborative Environments. 2021 XI Brazilian Symposium on Computing Systems Engineering (SBESC), 1-8. Online publication date: 22-Nov-2021. https://doi.org/10.1109/SBESC53686.2021.9628345

Vicenzi J, Knorst T, Jordan M, Korol G, Beck A, Rutzig M. (2021). TRIPP: Transparent Resource Provisioning for Multi-Tenant CPU-GPU based Cloud Environments. 2021 XI Brazilian Symposium on Computing Systems Engineering (SBESC), 1-8. Online publication date: 22-Nov-2021. https://doi.org/10.1109/SBESC53686.2021.9628223
