High Performance Heterogeneous Multicore Architectures: A Study

Published: 06 June 2020

Abstract

The growing demand for high-performance, energy-efficient computing systems has driven the adoption of heterogeneous computing. However, incorporating different architectures into one system complicates the distribution of the workload among them. Several research contributions have addressed this challenge while meeting the goals of high-performance computing systems. This paper reviews some of the proposed workload partitioning approaches for GPU-based, DSP-based and FPGA-based heterogeneous systems. It also covers comparison studies on the FPGA versus DSP and FPGA versus GPU debates, which show that collaboration between these architectures is sometimes the key. The aim of this study is to give academic and industrial researchers insight into techniques for achieving workload balancing in heterogeneous systems and to motivate further research in the field.

Cited By

  • (2023) A Design of Homogeneous Multi-Core SoC with General-Purpose Floating-Point Processors. 2023 IEEE 17th International Conference on Anti-counterfeiting, Security, and Identification (ASID), pp. 126-130. DOI: 10.1109/ASID60355.2023.10425983. Online publication date: 1-Dec-2023
  • (2023) An Ageing-Aware and Temperature Mapping Algorithm for Multilevel Cache Nodes. IEEE Access, vol. 11, pp. 19162-19172. DOI: 10.1109/ACCESS.2022.3174084. Online publication date: 2023
  • (2022) Classification Techniques for Arrhythmia Patterns Using Convolutional Neural Networks and Internet of Things (IoT) Devices. IEEE Access, vol. 10, pp. 87387-87403. DOI: 10.1109/ACCESS.2022.3192390. Online publication date: 2022

    Published In

    ISCSIC 2019: Proceedings of the 2019 3rd International Symposium on Computer Science and Intelligent Control
    September 2019
    397 pages
    ISBN:9781450376617
    DOI:10.1145/3386164
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. CPU
    2. DSP
    3. FPGA
    4. GPU
    5. Processing Capabilities
    6. Processing Units
    7. Workload Partition

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ISCSIC 2019

    Acceptance Rates

    ISCSIC 2019 Paper Acceptance Rate 77 of 152 submissions, 51%;
    Overall Acceptance Rate 192 of 401 submissions, 48%
