DOI: 10.1145/2897937.2898075
research-article

Quest for high-performance bufferless NoCs with single-cycle express paths and self-learning throttling

Published: 05 June 2016

Abstract

Router buffers are the main reason for the Network-on-Chip's (NoC) scalable bandwidth, but they consume significant area and power. The SCEPTER bufferless NoC sets up single-cycle virtual express paths dynamically, allowing packets to traverse non-minimal paths without latency penalty. Using prioritization, bypassing, and throttling mechanisms, we maximize opportunities to use these paths while pushing bandwidth. For 64 and 256 nodes, we achieve 62% lower latency, 1.3x higher throughput, and 35% lower starvation than a baseline bufferless NoC under synthetic traffic. Full-system 36-core simulations show 19% lower runtime and performance on par with a buffered network, with 36% lower area and 33% lower power.
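
For readers unfamiliar with bufferless operation, the following is a minimal sketch of one common way a router without buffers can resolve contention: deflection routing with oldest-first priority. It is not the SCEPTER mechanism, and the abstract does not say which policy the baseline bufferless NoC uses. The function names `productive_ports` and `assign_ports`, the port labels, and the coordinate convention are all assumptions made for illustration. The sketch also assumes an interior mesh router with four ports and that flits already at their destination have been ejected beforehand.

```python
from typing import Dict, List, Tuple

Coord = Tuple[int, int]

# Output ports of an interior 2D-mesh router (hypothetical labels).
PORTS = ["N", "E", "S", "W"]


def productive_ports(cur: Coord, dst: Coord) -> List[str]:
    """Ports that move a flit closer to dst (x grows east, y grows north)."""
    (x, y), (dx, dy) = cur, dst
    ports = []
    if dx > x:
        ports.append("E")
    if dx < x:
        ports.append("W")
    if dy > y:
        ports.append("N")
    if dy < y:
        ports.append("S")
    return ports


def assign_ports(cur: Coord, flits: List[Tuple[int, Coord]]) -> Dict[int, Tuple[str, bool]]:
    """Give every incoming flit an output port in the same cycle.

    flits holds (age, destination) pairs, at most one per input port.
    Returns {flit index: (port, deflected)}. Older flits choose first;
    a flit whose productive ports are all taken is deflected to any
    free port, because there is no buffer to hold it.
    """
    assert len(flits) <= len(PORTS), "one flit per input port at most"
    free = list(PORTS)
    result: Dict[int, Tuple[str, bool]] = {}
    # Oldest-first ordering (an assumed priority policy for this sketch).
    order = sorted(range(len(flits)), key=lambda i: -flits[i][0])
    for i in order:
        _, dst = flits[i]
        wanted = [p for p in productive_ports(cur, dst) if p in free]
        port = wanted[0] if wanted else free[0]
        free.remove(port)
        result[i] = (port, not wanted)
    return result


if __name__ == "__main__":
    # Two flits both want to go east from router (1, 1); the younger one loses.
    print(assign_ports((1, 1), [(5, (3, 1)), (9, (2, 1)), (1, (1, 3))]))
```

In this example the younger eastbound flit is deflected north, which in turn displaces the northbound flit. This kind of knock-on deflection is the non-minimal routing whose latency cost the abstract says the express paths are meant to hide.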

    Information & Contributors

    Information

    Published In

    DAC '16: Proceedings of the 53rd Annual Design Automation Conference
    June 2016
    1048 pages
    ISBN:9781450342360
    DOI:10.1145/2897937
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Qualifiers

    • Research-article

    Conference

    DAC '16

    Acceptance Rates

    Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months)9
    • Downloads (Last 6 weeks)1
    Reflects downloads up to 16 Feb 2025

    Citations

    Cited By

    • (2024) Subnetwork Based Traffic Aware Rerouting for CMesh Bufferless Network-on-Chip. Journal of Circuits, Systems and Computers 33:12. DOI: 10.1142/S0218126624502074. Online publication date: 16-Feb-2024
    • (2023) Fast Performance Analysis for NoCs With Weighted Round-Robin Arbitration and Finite Buffers. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 31:5 (670-683). DOI: 10.1109/TVLSI.2023.3250662. Online publication date: May-2023
    • (2023) Improving power and performance of on-chip network through virtual channel sharing and power gating. Integration 93 (102059). DOI: 10.1016/j.vlsi.2023.102059. Online publication date: Nov-2023
    • (2023) A Survey of Machine Learning for Network-on-Chips. Journal of Parallel and Distributed Computing (104778). DOI: 10.1016/j.jpdc.2023.104778. Online publication date: Nov-2023
    • (2022) A Survey of Machine Learning for Computer Architecture and Systems. ACM Computing Surveys 55:3 (1-39). DOI: 10.1145/3494523. Online publication date: 3-Feb-2022
    • (2021) A Deflection-Based Deadlock Recovery Framework to Achieve High Throughput for Faulty NoCs. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 40:10 (2170-2183). DOI: 10.1109/TCAD.2020.3037310. Online publication date: Oct-2021
    • (2021) Reduced Worst-Case Communication Latency Using Single-Cycle Multihop Traversal Network-on-Chip. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 40:7 (1381-1394). DOI: 10.1109/TCAD.2020.3015440. Online publication date: Jul-2021
    • (2021) Opportunistic Caching in NoC: Exploring Ways to Reduce Miss Penalty. IEEE Transactions on Computers 70:6 (892-905). DOI: 10.1109/TC.2021.3069968. Online publication date: 1-Jun-2021
    • (2021) Theoretical Analysis and Evaluation of NoCs with Weighted Round-Robin Arbitration. 2021 IEEE/ACM International Conference On Computer Aided Design (ICCAD) (1-9). DOI: 10.1109/ICCAD51958.2021.9643448. Online publication date: 1-Nov-2021
    • (2020) Exploiting On-Chip Routers to Store Dirty Cache Blocks in Tiled Chip Multi-processors. 2020 IEEE Computer Society Annual Symposium on VLSI (ISVLSI) (147-152). DOI: 10.1109/ISVLSI49217.2020.00036. Online publication date: Jul-2020