Stealth-ACK: stealth transmissions of NoC acknowledgements

Stealth-ACK: stealth transmission of network-on-chip ACK packets

  • Research Paper
Science China Information Sciences

Abstract

Network-on-Chip (NoC) is a promising replacement for bus architectures because it scales better. In state-of-the-art NoCs, each packet consists of several fixed-length flits, which simplifies the allocation of network resources but leaves many bits unused. In this paper, we propose a novel technique called Stealth-ACK to address this problem. Stealth-ACK uses the unused bits in the head flits of non-ACK packets to carry ACK information and transmit it stealthily. Such stealth transmission reduces both the number of dedicated ACK packets on the NoC and the number of unused bits in the head flits of non-ACK packets, which significantly reduces wasted NoC bandwidth. Experimental results show that, on average, Stealth-ACK increases the throughput of a 16 × 16 2-D mesh NoC by 11.9% and reduces NoC latency by 34.8% on SPLASH-2 application traces. Moreover, Stealth-ACK requires only trivial hardware modifications to basic router architectures, incurring negligible power and area overhead.
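
To make the idea of carrying an ACK in otherwise unused head-flit bits concrete, here is a minimal sketch. The 128-bit flit width, the 64 bits assumed to be occupied by the normal header fields, the ACK field layout (1 valid bit, 8-bit destination node ID, 8-bit ACK ID) and the function names `embed_ack` and `extract_ack` are all illustrative assumptions; the abstract does not specify the actual flit format.

```python
# Hypothetical flit layout; none of these widths come from the paper.
FLIT_BITS = 128      # assumed flit width
HEADER_BITS = 64     # assumed bits used by the ordinary head-flit fields
ACK_DEST_BITS = 8    # enough to name the 256 nodes of a 16 x 16 mesh
ACK_ID_BITS = 8      # assumed width of an ACK identifier
ACK_FIELD_BITS = 1 + ACK_DEST_BITS + ACK_ID_BITS  # valid bit + dest + id


def embed_ack(head_flit: int, ack_dest: int, ack_id: int) -> int:
    """Return a copy of head_flit with an ACK stored in its spare bits."""
    if head_flit >> HEADER_BITS:
        raise ValueError("spare bits of this head flit are already in use")
    if not 0 <= ack_dest < (1 << ACK_DEST_BITS):
        raise ValueError("ack_dest out of range")
    if not 0 <= ack_id < (1 << ACK_ID_BITS):
        raise ValueError("ack_id out of range")
    # Bit 0 of the field is the valid flag, then the destination, then the id.
    field = 1 | (ack_dest << 1) | (ack_id << (1 + ACK_DEST_BITS))
    return head_flit | (field << HEADER_BITS)


def extract_ack(head_flit: int):
    """Split a head flit into (header, ack); ack is (dest, id) or None."""
    header = head_flit & ((1 << HEADER_BITS) - 1)
    field = (head_flit >> HEADER_BITS) & ((1 << ACK_FIELD_BITS) - 1)
    if not field & 1:
        return header, None  # valid bit clear: no ACK is riding along
    ack_dest = (field >> 1) & ((1 << ACK_DEST_BITS) - 1)
    ack_id = (field >> (1 + ACK_DEST_BITS)) & ((1 << ACK_ID_BITS) - 1)
    return header, (ack_dest, ack_id)
```

Under these assumptions, a receiving router or network interface would call `extract_ack` on every head flit and forward the header unchanged, so the host packet is unaffected while the ACK consumes no flit of its own.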

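Highlights

First, the proposed method of using unused bits in the head flits of non-ACK packets to carry ACK information can be combined seamlessly with a variety of cache coherence protocols, reducing wasted bandwidth. Second, Stealth-ACK provides flexible modes (a hidden mode and an exposed mode) for transmitting ACK information, which significantly lowers the average latency of both ACK information and non-ACK packets. Finally, Stealth-ACK requires only simple modifications to the basic router architecture, and the resulting power and area overheads are negligible.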
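
The two transmission modes can be pictured with a small sketch. It assumes, without confirmation from the text above, that hidden mode means piggybacking an ACK on a non-ACK packet bound for the same destination, and that exposed mode means falling back to a dedicated ACK packet once an ACK has waited too long. The class `AckScheduler`, its methods, and the `max_wait` timeout policy are hypothetical and serve only as an illustration.

```python
from collections import defaultdict, deque


class AckScheduler:
    """Hypothetical per-node queue of ACKs waiting for a ride."""

    def __init__(self, max_wait: int):
        self.max_wait = max_wait            # assumed timeout, in cycles
        self.pending = defaultdict(deque)   # dest -> deque of (ack_id, cycle)

    def enqueue_ack(self, dest: int, ack_id: int, now: int) -> None:
        """Record an ACK that must eventually reach dest."""
        self.pending[dest].append((ack_id, now))

    def on_data_packet(self, dest: int):
        """Hidden mode (assumed): return an ACK id to piggyback, or None."""
        queue = self.pending.get(dest)
        if not queue:
            return None
        ack_id, _ = queue.popleft()
        return ack_id

    def expired_acks(self, now: int):
        """Exposed mode (assumed): pop ACKs that waited max_wait cycles or more.

        Returns a list of (dest, ack_id) to send as dedicated ACK packets.
        """
        expired = []
        for dest, queue in self.pending.items():
            while queue and now - queue[0][1] >= self.max_wait:
                ack_id, _ = queue.popleft()
                expired.append((dest, ack_id))
        return expired
```

A policy of this shape trades ACK latency against bandwidth: a larger `max_wait` gives more ACKs a chance to travel hidden, while a smaller one bounds how long an ACK can be delayed.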

Author information

Corresponding author

Correspondence to Rui Mao.

About this article

Cite this article

Tao, J., Qiu, S., Liu, S. et al. Stealth-ACK: stealth transmissions of NoC acknowledgements. Sci. China Inf. Sci. 60, 092102 (2017). https://doi.org/10.1007/s11432-015-0328-y

  • DOI: https://doi.org/10.1007/s11432-015-0328-y
