DOI: 10.1145/2228360.2228512
research-article

Architecture support for accelerator-rich CMPs

Published: 03 June 2012

Abstract

This work discusses hardware architectural support for accelerator-rich CMPs (ARC). First, we present a hardware resource management scheme for accelerator sharing. This scheme supports sharing and arbitration of a common set of accelerators among multiple cores, and it uses a hardware-based arbitration mechanism to tell each core how long it must wait before a particular resource becomes available. Second, we propose a lightweight interrupt system to reduce the OS overhead of handling interrupts, which occur frequently in an accelerator-rich platform. Third, we propose architectural support for composing a larger virtual accelerator out of multiple smaller accelerators. We have also implemented a complete simulation tool-chain to verify our ARC architecture. Experimental results show significant improvements in performance (51X on average) and energy (17X on average) compared to approaches using OS-based accelerator management.
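
The wait-time feedback described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware design: the names `Accelerator`, `Arbiter`, and `choose`, the FIFO queueing policy, and the cycle counts are all hypothetical. The sketch assumes the arbiter reports the total work already queued on each accelerator, and that a core falls back to software when waiting would cost more than running the kernel itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Accelerator:
    """One shared accelerator with a FIFO of pending job lengths (in cycles)."""
    name: str
    pending: List[int] = field(default_factory=list)

    def wait_time(self) -> int:
        # Assumption: the wait seen by a new request is the sum of queued work.
        return sum(self.pending)


class Arbiter:
    """Hypothetical arbiter that answers wait-time queries and takes reservations."""

    def __init__(self, accelerators: List[Accelerator]) -> None:
        self.accelerators = accelerators

    def query(self) -> Accelerator:
        # Report the accelerator with the shortest estimated wait.
        return min(self.accelerators, key=lambda a: a.wait_time())

    def reserve(self, acc: Accelerator, job_cycles: int) -> int:
        # Enqueue the job and return the wait the requester will experience.
        wait = acc.wait_time()
        acc.pending.append(job_cycles)
        return wait


def choose(arbiter: Arbiter, accel_cycles: int, software_cycles: int) -> str:
    """Core-side decision: use an accelerator only if wait + run beats software."""
    best = arbiter.query()
    if best.wait_time() + accel_cycles < software_cycles:
        arbiter.reserve(best, accel_cycles)
        return best.name
    return "software"


if __name__ == "__main__":
    arb = Arbiter([Accelerator("a0", [100]), Accelerator("a1", [30, 20])])
    print(choose(arb, accel_cycles=40, software_cycles=1000))
    print(choose(arb, accel_cycles=40, software_cycles=60))
```

In this model the core makes its decision from the feedback alone, without trapping into an OS scheduler, which is the property the abstract attributes to the hardware arbitration mechanism.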

Published In

DAC '12: Proceedings of the 49th Annual Design Automation Conference
June 2012
1357 pages
ISBN:9781450311991
DOI:10.1145/2228360
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. accelerator sharing
  2. accelerator virtualization
  3. chip multiprocessor
  4. hardware accelerators

Qualifiers

  • Research-article

Conference

DAC '12
DAC '12: The 49th Annual Design Automation Conference 2012
June 3 - 7, 2012
San Francisco, California

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

Cited By

  • (2024) AuRORA: A Full-Stack Solution for Scalable and Virtualized Accelerator Integration. IEEE Micro, 44:4 (97-105). DOI: 10.1109/MM.2024.3409546. Online publication date: 13-Jun-2024
  • (2024) WOLT: Transparent Deployment of ML Workloads on Lightweight Many-Accelerator Architectures. 2024 IEEE 42nd International Conference on Computer Design (ICCD) (637-644). DOI: 10.1109/ICCD63220.2024.00102. Online publication date: 18-Nov-2024
  • (2024) Data Motion Acceleration: Chaining Cross-Domain Multi Accelerators. 2024 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (1043-1062). DOI: 10.1109/HPCA57654.2024.00083. Online publication date: 2-Mar-2024
  • (2023) AuRORA: Virtualized Accelerator Orchestration for Multi-Tenant Workloads. Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture (62-76). DOI: 10.1145/3613424.3614280. Online publication date: 28-Oct-2023
  • (2023) Cohort: Software-Oriented Acceleration for Heterogeneous SoCs. Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3 (105-117). DOI: 10.1145/3582016.3582059. Online publication date: 25-Mar-2023
  • (2023) MindCrypt: The Brain as a Random Number Generator for SoC-Based Brain-Computer Interfaces. 2023 IEEE 41st International Conference on Computer Design (ICCD) (70-77). DOI: 10.1109/ICCD58817.2023.00021. Online publication date: 6-Nov-2023
  • (2023) An Analysis of Accelerator Data-Transfer Modes in NoC-Based SoC Architectures. 2023 IEEE High Performance Extreme Computing Conference (HPEC) (1-7). DOI: 10.1109/HPEC58863.2023.10363546. Online publication date: 25-Sep-2023
  • (2023) IXIAM: ISA EXtension for Integrated Accelerator Management. IEEE Access, 11 (33768-33791). DOI: 10.1109/ACCESS.2023.3264265. Online publication date: 2023
  • (2023) TCADer: A Tightly Coupled Accelerator Design framework for heterogeneous system with hardware/software co-design. Journal of Systems Architecture, 136 (102822). DOI: 10.1016/j.sysarc.2023.102822. Online publication date: Mar-2023
  • (2023) Accelerating OpenVX Application Kernels Using Halide Scheduling. Journal of Signal Processing Systems, 95:5 (623-642). DOI: 10.1007/s11265-023-01851-1. Online publication date: 28-Feb-2023
