DOI: 10.1145/2744769.2744879
research-article

On-chip interconnection network for accelerator-rich architectures

Published: 07 June 2015

Abstract

Modern processors include hardware accelerators to provide high computation capability and low energy consumption. With specialized hardware implementations, accelerators can improve performance and reduce energy consumption by orders of magnitude compared to general-purpose cores. However, hardware accelerators cannot tolerate memory and communication latency through extensive multi-threading; this increases the demand for efficient memory interface and network-on-chip (NoC) designs.
In this paper, we explore the global management of NoCs in accelerator-rich architectures to provide predictable performance and energy efficiency. Accelerator memory accesses exhibit predictable patterns, creating highly utilized network paths. Leveraging these observations, we propose reserving NoC paths based on timing information from the global manager. We further maximize the benefit of path reservation by regularizing the communication traffic through TLB buffering and hybrid switching. The combined effect of these optimizations reduces the total execution time by 11.3% over a packet-switched mesh NoC and by 8.5% over EVC [18] and a previous hybrid-switched NoC [29].
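
The idea of reserving NoC paths from a global view can be illustrated with a toy model. The sketch below is not the paper's mechanism: the class `PathReservationTable`, the function `xy_route`, the use of dimension-ordered (XY) routing, and the simplification that every link on a path is held for the same window of time slots are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[int, int]      # (x, y) router coordinate in a 2D mesh
Link = Tuple[Node, Node]    # directed link between adjacent routers


def xy_route(src: Node, dst: Node) -> List[Link]:
    """Return the directed links of a dimension-ordered (X first, then Y) route."""
    links: List[Link] = []
    x, y = src
    while x != dst[0]:
        nx = x + (1 if dst[0] > x else -1)
        links.append(((x, y), (nx, y)))
        x = nx
    while y != dst[1]:
        ny = y + (1 if dst[1] > y else -1)
        links.append(((x, y), (x, ny)))
        y = ny
    return links


class PathReservationTable:
    """Hypothetical global table recording the time slots reserved on each link."""

    def __init__(self) -> None:
        self._busy: Dict[Link, Set[int]] = {}

    def try_reserve(self, src: Node, dst: Node, start: int, length: int) -> bool:
        """Reserve all links on the XY route for slots [start, start + length).

        The reservation is all-or-nothing: if any link is already taken in
        any requested slot, nothing is reserved and False is returned.
        """
        route = xy_route(src, dst)
        slots = set(range(start, start + length))
        if any(self._busy.get(link, set()) & slots for link in route):
            return False
        for link in route:
            self._busy.setdefault(link, set()).update(slots)
        return True


if __name__ == "__main__":
    table = PathReservationTable()
    print(table.try_reserve((0, 0), (2, 1), start=0, length=4))  # True
    print(table.try_reserve((1, 0), (2, 0), start=2, length=4))  # False: shared link busy in slots 2-3
    print(table.try_reserve((1, 0), (2, 0), start=4, length=4))  # True: window no longer overlaps
```

Because the manager in this toy model knows when each transfer will start and how long it lasts, a conflicting request can be shifted to a later window instead of contending for links at run time.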

References

[1]
"The mobile robot programming toolkit." [Online]. Available: http://www.mrpt.org/
[2]
N. Agarwal et al., "Garnet: A detailed on-chip network model inside a full-system simulator," in ISPASS, April, pp. 33--42.
[3]
A. Bakhoda et al., "Throughput-effective on-chip networks for manycore accelerators," in MICRO, 2010.
[4]
C. Bienia, "Benchmarking modern multiprocessors," Ph.D. dissertation, Princeton University, January 2011.
[5]
N. Clark et al., "Veal: Virtualized execution accelerator for loops," in ISCA, 2008.
[6]
J. Cong et al., "Customizable domain-specific computing," Design Test of Computers, IEEE, pp. 6--15, March 2011.
[7]
J. Cong et al., "Architecture support for accelerator-rich CMPs," in DAC, 2012.
[8]
J. Cong et al., "Architecture support for domain-specific accelerator-rich CMPs," ACM TECS, vol. 13, no. 4s, pp. 131:1--131:26, 2014.
[9]
J. Cong et al., "BiN: a buffer-in-NUCA scheme for accelerator-rich CMPs," in ISLPED, 2012.
[10]
J. Cong et al., "Optimization of interconnects between accelerators and shared memories in dark silicon," in ICCAD, 2013.
[11]
C. F. Fajardo et al., "Buffer-integrated-cache: a cost-effective SRAM architecture for handheld and embedded platforms," in DAC, 2011.
[12]
H. Franke et al., "Introduction to the wire-speed processor and architecture," IBM Journal of Research and Development, vol. 54, no. 1, pp. 3--1, 2010.
[13]
K. Goossens et al., "Æthereal network on chip: concepts, architectures, and implementations," Design Test of Computers, IEEE, vol. 22, no. 5, pp. 414--421, 2005.
[14]
J. R. Hauser et al., "Garp: A MIPS processor with a reconfigurable coprocessor," in FPT, 1997.
[15]
R. Hou et al., "Efficient data streaming with on-chip accelerators: Opportunities and challenges," in HPCA, 2011.
[16]
N. D. E. Jerger et al., "Circuit-switched coherence," in NOCS, 2008.
[17]
A. B. Kahng et al., "Orion 2.0: A fast and accurate NoC power and area model for early-stage design space exploration," in DATE, 2009.
[18]
A. Kumar et al., "Express virtual channels: Towards the ideal interconnection fabric," in ISCA, 2007.
[19]
M. J. Lyons et al., "The accelerator store: a shared memory framework for accelerator-based systems," TACO, vol. 8, no. 4, p. 48, 2012.
[20]
P. Magnusson et al., "Simics: A full system simulation platform," Computer, vol. 35, no. 2, pp. 50--58, Feb.
[21]
M. M. K. Martin et al., "Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset," SIGARCH Computer Architecture News, 2005.
[22]
U. Nawathe et al., "An 8-core, 64-thread, 64-bit, power efficient SPARC SoC (Niagara 2)," in ISSCC, 2007.
[23]
H. Park et al., "Polymorphic pipeline array: a flexible multicore accelerator with virtualized execution for mobile multimedia applications," in MICRO, 2009.
[24]
L. Seiler et al., "Larrabee: a many-core x86 architecture for visual computing," ACM Transactions on Graphics (TOG), vol. 27, no. 3, p. 18, 2008.
[25]
S. R. Vangal et al., "An 80-tile sub-100-W teraflops processor in 65-nm CMOS," Solid-State Circuits, vol. 43, no. 1, pp. 29--41, 2008.
[26]
D. Wentzlaff et al., "On-chip interconnection architecture of the tile processor," Micro, IEEE, pp. 15--31, 2007.
[27]
D. Wiklund et al., "Socbus: Switched network on chip for hard real time embedded systems," in IPDPS, 2003.
[28]
P. T. Wolkotte et al., "An energy-efficient reconfigurable circuit-switched network-on-chip," in IPDPS, 2005.
[29]
J. Yin et al., "Energy-efficient time-division multiplexed hybrid-switched NoC for heterogeneous multicore systems," in IPDPS, 2014.

        Published In

        DAC '15: Proceedings of the 52nd Annual Design Automation Conference
        June 2015
        1204 pages
        ISBN:9781450335201
        DOI:10.1145/2744769
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

        Publisher

        Association for Computing Machinery

        New York, NY, United States

        Conference

        DAC '15
        DAC '15: The 52nd Annual Design Automation Conference 2015
        June 7 - 11, 2015
        San Francisco, California

        Acceptance Rates

        Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

        Cited By

        • (2023) TB-TBP: a task-based adaptive routing algorithm for network-on-chip in heterogenous CPU-GPU architectures. The Journal of Supercomputing 80:5 (6311-6335). DOI: 10.1007/s11227-023-05700-7. Online publication date: 23-Oct-2023
        • (2022) Upward Packet Popup for Deadlock Freedom in Modular Chiplet-Based Systems. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA) (986-1000). DOI: 10.1109/HPCA53966.2022.00076. Online publication date: Apr-2022
        • (2022) Enabling circuit-switching in modern on-chip networks. Microprocessors and Microsystems 95 (104712). DOI: 10.1016/j.micpro.2022.104712. Online publication date: Nov-2022
        • (2021) ALPHA: A Learning-Enabled High-Performance Network-on-Chip Router Design for Heterogeneous Manycore Architectures. IEEE Transactions on Sustainable Computing 6:2 (274-288). DOI: 10.1109/TSUSC.2020.2981340. Online publication date: 1-Apr-2021
        • (2021) Local Traffic-Based Energy-Efficient Hybrid Switching for On-Chip Networks. 2021 29th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP) (198-206). DOI: 10.1109/PDP52278.2021.00039. Online publication date: Mar-2021
        • (2020) Energy-Efficient On-Chip Networks through Profiled Hybrid Switching. Proceedings of the 2020 on Great Lakes Symposium on VLSI (241-246). DOI: 10.1145/3386263.3406934. Online publication date: 7-Sep-2020
        • (2020) A comprehensive methodology to determine optimal coherence interfaces for many-accelerator SoCs. Proceedings of the ACM/IEEE International Symposium on Low Power Electronics and Design (145-150). DOI: 10.1145/3370748.3406564. Online publication date: 10-Aug-2020
        • (2020) In-Network Memory Access Ordering for Heterogeneous Multicore Systems. 2020 14th IEEE/ACM International Symposium on Networks-on-Chip (NOCS) (1-8). DOI: 10.1109/NOCS50636.2020.9241583. Online publication date: 24-Sep-2020
        • (2019) Is Your Bus Arbiter Really Fair? Restoring Fairness in AXI Interconnects for FPGA SoCs. ACM Transactions on Embedded Computing Systems 18:5s (1-22). DOI: 10.1145/3358183. Online publication date: 7-Oct-2019
        • (2019) Achieving Flexible Global Reconfiguration in NoCs using Reconfigurable Rings. IEEE Transactions on Parallel and Distributed Systems (1-1). DOI: 10.1109/TPDS.2019.2940190. Online publication date: 2019