Abstract
Network-on-Chip (NoC) is a promising replacement for bus architectures because it scales better. In state-of-the-art NoCs, each packet consists of several fixed-length flits, which simplifies the allocation of network resources but leaves many bits unused. In this paper, we propose a novel technique called Stealth-ACK to address this problem. Stealth-ACK uses the unused bits in the head flits of non-ACK packets to carry and stealthily transmit ACK information. These stealth transmissions reduce both the number of dedicated ACK packets on the NoC and the number of unused bits in the head flits of non-ACK packets, which significantly reduces wasted NoC bandwidth. Experimental results show that Stealth-ACK increases the throughput of a 16 × 16 2-D mesh NoC by 11.9% on average, and reduces NoC latency by 34.8% on average on SPLASH-2 application traces. Moreover, Stealth-ACK requires only trivial hardware modifications to basic router architectures, incurring negligible power and area cost.
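Highlights
First, the proposed method, which uses the unused bits in the head flits of non-ACK packets to carry ACK information, combines seamlessly with a variety of cache coherence protocols and thereby reduces wasted bandwidth. Second, Stealth-ACK offers flexible modes (a hidden mode and an exposed mode) for transmitting ACK information, which significantly lowers the average latency of both ACK information and non-ACK packets. Finally, Stealth-ACK requires only simple modifications to the basic router architecture, and the power and area overhead of these modifications is negligible.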
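The idea of carrying ACK information in otherwise unused head-flit bits, with a hidden mode and an exposed mode, can be illustrated with a small sketch. This is not the authors' implementation: the flit width, ACK record size, header size, the timeout that triggers the exposed mode, and the names `HeadFlit` and `NetworkInterface` are all assumptions made for illustration, and the sketch further assumes that an ACK can only ride on a packet bound for the same destination.

```python
from collections import deque
from dataclasses import dataclass, field

FLIT_BITS = 128    # assumed flit width (not from the paper)
ACK_BITS = 16      # assumed size of one ACK record
HEADER_BITS = 32   # assumed header size of a dedicated ACK packet
MAX_WAIT = 8       # assumed cycles an ACK may wait before it is exposed


@dataclass
class HeadFlit:
    dest: int
    used_bits: int                       # bits taken by routing/control fields
    acks: list = field(default_factory=list)
    is_ack_packet: bool = False

    def free_bits(self):
        # Bits still unused after header fields and any carried ACKs.
        return FLIT_BITS - self.used_bits - ACK_BITS * len(self.acks)


class NetworkInterface:
    """Hypothetical sender-side logic holding ACKs until they can be sent."""

    def __init__(self):
        self.pending = {}   # dest -> deque of (ack_id, enqueue_cycle)

    def queue_ack(self, dest, ack_id, cycle):
        self.pending.setdefault(dest, deque()).append((ack_id, cycle))

    def send_packet(self, dest, used_bits):
        # Hidden mode: fill the head flit's unused bits with waiting ACKs.
        flit = HeadFlit(dest, used_bits)
        waiting = self.pending.get(dest, deque())
        while waiting and flit.free_bits() >= ACK_BITS:
            ack_id, _ = waiting.popleft()
            flit.acks.append(ack_id)
        return flit

    def flush_expired(self, cycle):
        # Exposed mode: ACKs that waited too long leave as dedicated packets.
        out = []
        for dest, waiting in self.pending.items():
            while waiting and cycle - waiting[0][1] >= MAX_WAIT:
                ack_id, _ = waiting.popleft()
                out.append(HeadFlit(dest, HEADER_BITS, [ack_id], True))
        return out
```

In this sketch an ACK that finds a suitable non-ACK packet costs no extra packet at all, while the timeout bounds how long an ACK can be delayed when no such packet appears; the actual policy for switching between the two modes is defined in the paper itself.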
Cite this article
Tao, J., Qiu, S., Liu, S. et al. Stealth-ACK: stealth transmissions of NoC acknowledgements. Sci. China Inf. Sci. 60, 092102 (2017). https://doi.org/10.1007/s11432-015-0328-y
DOI: https://doi.org/10.1007/s11432-015-0328-y