DOI: 10.1145/3061639.3062278
research-article

Low-Power On-Chip Network Providing Guaranteed Services for Snoopy Coherent and Artificial Neural Network Systems

Published: 18 June 2017

Abstract

During the transition to packet-switched on-chip networks, we lose the relative timing and ordering of requests, which are essential for shared-memory coherency and for communicating spikes in hardware-based artificial neural networks. We present a bufferless network architecture that enforces time-based sharing of multi-hop single-cycle paths, providing guaranteed services at low cost. We guarantee ordered delivery of requests, fixed network latency, and jitter-free neural spikes. In a 64-node network, we achieve 84% lower latency and 7.5x higher throughput than SCORPIO. Full-system 36-core simulations show 9% lower runtime than SCORPIO, with 39% lower power and 36% lower area.
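To make the idea of time-based sharing concrete, the following is a minimal sketch, not the architecture described in the paper. It assumes a plain round-robin schedule in which node `cycle % num_nodes` owns the whole network for one cycle and may deliver one request in that slot. The function name `tdm_delivery_order`, its parameters, and the one-slot-per-node policy are all hypothetical and chosen only for illustration. Under these assumptions every node observes requests in the same slot order, and the delivery time of a request depends only on the slot it is sent in.

```python
from collections import deque


def tdm_delivery_order(num_nodes, pending, num_cycles):
    """Simulate a hypothetical round-robin time-slot schedule.

    pending maps a node id to the list of requests it wants to send,
    in issue order. Returns (cycle, node, request) tuples in the order
    the requests would be delivered to every node.
    """
    # One FIFO of outstanding requests per node.
    queues = {n: deque(pending.get(n, [])) for n in range(num_nodes)}
    delivered = []
    for cycle in range(num_cycles):
        # Assumption: exactly one node owns the network in each cycle.
        owner = cycle % num_nodes
        if queues[owner]:
            # The owner sends its oldest request; an idle slot is simply unused.
            delivered.append((cycle, owner, queues[owner].popleft()))
    return delivered


if __name__ == "__main__":
    order = tdm_delivery_order(4, {0: ["a0", "a1"], 2: ["c0"]}, 8)
    for cycle, node, request in order:
        print(f"cycle {cycle}: node {node} delivers {request}")
```

In this toy model, node 0 sends in cycles 0 and 4 and node 2 sends in cycle 2, so the global order is fixed by the schedule rather than by arbitration or buffering. A node may wait up to `num_nodes - 1` cycles for its slot, which is the cost of such a simple schedule.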


Published In

DAC '17: Proceedings of the 54th Annual Design Automation Conference 2017
June 2017
533 pages
ISBN: 9781450349277
DOI: 10.1145/3061639
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Acceptance Rates

Overall Acceptance Rate 1,770 of 5,499 submissions, 32%
