DOI: 10.1145/3394885.3431557
research-article

Exploiting HLS-Generated Multi-Version Kernels to Improve CPU-FPGA Cloud Systems

Published: 29 January 2021

Abstract

Cloud warehouses have been exploiting CPU-FPGA collaborative execution environments, where multiple clients share the same infrastructure to maximize resource utilization with the highest possible energy efficiency and scalability. However, resource provisioning is challenging in these environments, since kernels may be dispatched to both CPU and FPGA concurrently in a scenario that varies widely in available resources and workload characteristics. In this work, we propose MultiVers, a framework that leverages automatic HLS generation to enable further gains in such CPU-FPGA collaborative systems. MultiVers exploits automatic HLS generation to build libraries containing multiple versions of each incoming kernel request, greatly enlarging the design space that the cloud provider's allocation strategies can optimize over. MultiVers makes kernel multiversioning and the allocation strategy work symbiotically, allowing fine-tuning in terms of resource usage, performance, energy, or any combination of these parameters. We show the efficiency of MultiVers using real-world cloud request scenarios with a diversity of benchmarks, achieving average improvements in makespan and energy of up to 4.62x and 19.04x, respectively, over traditional allocation strategies executing non-optimized kernels.
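
To make the idea of choosing among several versions of one kernel concrete, the following is a minimal sketch and not the MultiVers implementation. It assumes a hypothetical library of three variants of a single kernel, each annotated with made-up estimates of FPGA resource usage (`luts`), execution time, and energy. The names `KernelVersion`, `LIBRARY`, and `pick_version`, the weighted-sum objective, and all numbers are assumptions introduced here for illustration only; the abstract does not specify how the allocation strategy scores versions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class KernelVersion:
    """One hypothetical HLS-generated variant of a kernel (illustrative fields)."""
    name: str
    luts: int         # FPGA resources the variant would occupy
    time_s: float     # estimated execution time, seconds
    energy_j: float   # estimated energy, joules


# Made-up library for a single kernel; values are not taken from the paper.
LIBRARY: List[KernelVersion] = [
    KernelVersion("baseline", luts=10_000, time_s=8.0, energy_j=40.0),
    KernelVersion("unroll4", luts=30_000, time_s=3.0, energy_j=24.0),
    KernelVersion("unroll16", luts=90_000, time_s=1.0, energy_j=30.0),
]


def pick_version(
    versions: List[KernelVersion],
    free_luts: int,
    w_time: float = 0.5,
    w_energy: float = 0.5,
) -> Optional[KernelVersion]:
    """Return the feasible version with the lowest weighted, normalized cost.

    A version is feasible if it fits in the currently free FPGA resources.
    Time and energy are each divided by the largest feasible value so the
    two terms are comparable before weighting (assumes positive estimates).
    """
    feasible = [v for v in versions if v.luts <= free_luts]
    if not feasible:
        return None  # nothing fits; a real scheduler might queue or use the CPU
    max_t = max(v.time_s for v in feasible)
    max_e = max(v.energy_j for v in feasible)
    return min(
        feasible,
        key=lambda v: w_time * v.time_s / max_t + w_energy * v.energy_j / max_e,
    )


if __name__ == "__main__":
    for free in (100_000, 50_000, 5_000):
        chosen = pick_version(LIBRARY, free)
        print(free, chosen.name if chosen else None)
```

With these illustrative numbers, a larger free area lets the selector pick the most aggressive variant, a tighter budget forces a smaller one, and shifting all weight to energy changes the choice even when every variant fits. That is the kind of trade-off a multi-version library exposes to an allocation strategy.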

Published In

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference
January 2021
930 pages
ISBN:9781450379991
DOI:10.1145/3394885

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. CPU-FPGA
  2. HLS
  3. collaborative
  4. energy
  5. makespan

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASPDAC '21
Acceptance Rates

ASPDAC '21 Paper Acceptance Rate 111 of 368 submissions, 30%;
Overall Acceptance Rate 466 of 1,454 submissions, 32%

Cited By

  • (2022) AdaFlow: A Framework for Adaptive Dataflow CNN Acceleration on FPGAs. 2022 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 244-249. DOI: 10.23919/DATE54114.2022.9774727. Online publication date: 14-Mar-2022
  • (2022) On the benefits of Collaborative Thread Throttling and HLS-Versioning in CPU-FPGA Environments. 2022 35th SBC/SBMicro/IEEE/ACM Symposium on Integrated Circuits and Systems Design (SBCCI), pp. 1-6. DOI: 10.1109/SBCCI55532.2022.9893223. Online publication date: 22-Aug-2022
  • (2021) ETCF – Energy-Aware CPU Thread Throttling and Workload Balancing Framework for CPU-FPGA Collaborative Environments. 2021 XI Brazilian Symposium on Computing Systems Engineering (SBESC), pp. 1-8. DOI: 10.1109/SBESC53686.2021.9628345. Online publication date: 22-Nov-2021
  • (2021) TRIPP: Transparent Resource Provisioning for Multi-Tenant CPU-GPU based Cloud Environments. 2021 XI Brazilian Symposium on Computing Systems Engineering (SBESC), pp. 1-8. DOI: 10.1109/SBESC53686.2021.9628223. Online publication date: 22-Nov-2021
