Adaptive distribution of control messages for improving bandwidth utilization in multiple NoC

The Journal of Supercomputing

Abstract

Networks-on-chip (NoCs) have advanced notably as the number of cores has increased. Bandwidth demand has grown steadily as network traffic has increased owing to high-workload applications. NoC traffic is broadly divided into control messages and data messages, of which data messages are larger. Because NoC channel bandwidth is set in proportion to the size of the data messages, the bandwidth remains underutilized while control messages are transmitted. This adversely affects NoC power and performance efficiency. In modern NoC architectures, multiple NoC is popular for utilizing bandwidth efficiently because it offers more than one physical channel for traffic communication. Conventional multiple-NoC architectures distribute traffic statically between the NoCs. This significantly affects the power-performance metrics. We have observed up to a fivefold variation in energy efficiency during the analysis of static traffic distribution for multiple NoC. In this paper, we propose an adaptive distribution of control messages for multiple NoC to improve bandwidth utilization. Control messages switch between the networks according to the runtime utilization of those networks. The proposed adaptive distribution of control messages improves energy efficiency by up to \(72.7\%\) and \(66.9\%\) on average over single-NoC and static traffic distribution in multiple NoC, respectively. Link utilization also improves by \(1.37\times\) and \(40\%\) on average over single-NoC and conventional static traffic distribution, respectively. Thus, the proposed adaptive distribution overcomes the drawbacks of static traffic distribution.
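The runtime switching described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Network` class, the per-window flit counters, the hysteresis `threshold`, the message sizes, and the function name `pick_network_for_control` are all assumptions introduced here. The abstract states only that control messages switch between networks according to runtime utilization.

```python
from dataclasses import dataclass


@dataclass
class Network:
    """One physical NoC, tracked over a single observation window."""
    name: str
    capacity_flits: int   # flits the links can carry per window (assumed unit)
    sent_flits: int = 0   # flits actually injected in the current window

    def utilization(self) -> float:
        """Fraction of the window's capacity already used."""
        return self.sent_flits / self.capacity_flits

    def send(self, flits: int) -> None:
        self.sent_flits += flits


def pick_network_for_control(networks, current, threshold=0.1):
    """Choose the network for the next control message.

    Stay on `current` unless some other network is less utilized by more
    than `threshold`; the margin avoids switching on every small change.
    """
    least_used = min(networks, key=lambda n: n.utilization())
    if current.utilization() - least_used.utilization() > threshold:
        return least_used
    return current


# Usage: data messages are pinned to noc0 (an assumption of this sketch),
# so control messages drift to the lightly loaded noc1.
noc0 = Network("noc0", capacity_flits=100)
noc1 = Network("noc1", capacity_flits=100)
current = noc0
for _ in range(10):
    noc0.send(5)                                   # one 5-flit data message
    current = pick_network_for_control([noc0, noc1], current)
    current.send(1)                                # one 1-flit control message
print(current.name, noc0.sent_flits, noc1.sent_flits)
```

The threshold is one possible design choice for keeping the selection stable; a real router would also need a rule for resetting the counters at the end of each window.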

Data Availability Statement

Not applicable.

Notes

  1. Single-Chip Cloud Computer.

  2. Bits per second.

  3. Gigabytes per second.

  4. Terabits per second.

  5. The number of messages.

  6. The dotted red box shows the volume of control messages, whereas the blue-lined box (the rightmost column of the table) highlights the percentage variation in the result metric.

  7. We use the terms percentage, fraction, ratio, and proportion interchangeably.

  8. Least Recently Used (LRU) replacement policy.

  9. The names of fine-grained messages indicate message class, message type, and cache association.

  10. Less-frequent messages.

  11. The gem5 simulator provides various CPU models, instruction set architectures (ISAs), caches, and cache-coherence protocol models.

  12. In full-system simulation, the hardware of the computer system is simulated at a level of detail such that the complete software stack of a real system can run on the simulator.

  13. Each router connects to the other routers via four links {N, S, E, W} in the North, South, East, and West directions, as is specific to the mesh topology.

  14. RMS: recognition, mining, and synthesis.

  15. The data patterns further depend on the number of cores, the cache size, and the cache line size.

  16. FLoating pOint oPerations (FLOP): a unit of computation used in high-performance computing for improving accuracy in results.

  17. These are the synchronization primitives.

  18. The geometric mean is the standard way to represent the results for averaging normalized results [61].

  19. For the classification of the benchmarks, please refer to Table 10.

  20. One message transmits through more than one NoC.

  21. Each NoC carries different types of messages.
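Note 18 prefers the geometric mean for averaging normalized results. A small sketch of why it behaves well on ratios follows; the input values are made up for illustration and are not results from the paper, and the helper name `geometric_mean` is introduced here.

```python
import math


def geometric_mean(values):
    """Geometric mean of positive numbers, computed in log space."""
    values = list(values)
    if not values or any(v <= 0 for v in values):
        raise ValueError("geometric mean needs at least one value, all positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical normalized results (baseline = 1.0): one benchmark doubles,
# another halves. The two effects cancel under the geometric mean.
normalized = [2.0, 0.5]
print(geometric_mean(normalized))            # close to 1.0
print(sum(normalized) / len(normalized))     # arithmetic mean: 1.25
```

With these inputs the arithmetic mean suggests a net gain although the two ratios offset each other exactly, which is the usual argument for using the geometric mean on normalized data.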

References

  1. Yadav S (2022) Interconnect paradigm shift towards networks-on-chip in manycore processors: a review on challenges. In: Kumar R, Ahn CW, Sharma TK, Verma OP, Agarwal A (eds) Soft computing: theories and applications. Lecture notes in networks and systems. Springer, Cham, p 425

  2. Morgan AA, Hassan AS, Watheq El-Kharashi M, Tawfik A (2020) NoC\(^2\): an efficient interfacing approach for heavily-communicating NoC-based systems. IEEE Access 8(2020):185992–186011

  3. Zhang C, Zhao C, He J, Chen S, Zheng L, Huang K, Han W, Zhai J (2021) Critique of planetary normal mode computation: parallel algorithms, performance, and reproducibility by SCC Team From Tsinghua University. IEEE Trans Parallel Distribut Syst 32(11):2631–2634

  4. Kang J-H, Hwang J, Hyung JS, Ryu H (2021) High-performance simulations of turbulent boundary layer flow using Intel Xeon Phi many-core processors. J Supercomput 77(9):9597–9614

  5. Ginosar R (2021) The plural many-core architecture-high performance at low power. In: Multi-processor system-on-chip 1: architectures, pp. 53–68

  6. Das R, Narayanasamy S, Satpathy SK, Dreslinski RG (2013) Catnap: energy proportional multiple network-on-chip. ACM SIGARCH Comput Archit News 41(2013):320–331

  7. Zhou W, Ouyang Y, Li J, Dongyu X (2023) A transparent virtual channel power gating method for on-chip network routers. Integration 88:286–297

  8. Yadav S, Laxmi V, Gaur MS, Kapoor HK (2019) Improving static power efficiency via placement of network demultiplexer over control plane of router in multi-NoCs. In: Proceedings of 56th ACM/IEEE Design Automation Conference (DAC). IEEE, pp. 1–2

  9. Yadav S, Raj R (2022) Power efficient network selector placement in control plane of multiple networks-on-chip. J Supercomput 78(2022):6664–6695

  10. Zhou W, Ouyang Y, Xu D, Huang Z, Liang H, Wen X (2023) Energy-efficient multiple network-on-chip architecture with bandwidth expansion. IEEE Trans Very Large Scale Integr (VLSI) Syst Preprint 1–14

  11. Yoon YJ, Concer N, Petracca M, Carloni L (2010) Virtual channels vs. multiple physical networks: a comparative analysis. In: Proceedings of 47th Conference on Design Automation Conference (DAC), ACM/EDAC/IEEE, pp. 162–165

  12. Yadav S, Laxmi V, Gaur MS (2020) Multiple-NoC exploration and customization for energy efficient traffic distribution. In: IFIP/IEEE 28th International Conference on Very Large Scale Integration (VLSI-SOC). IEEE, pp. 200–201

  13. Hesham S, Goehringer D, Abd MA, Ghany E (2020) HPPT-NoC: a dark-silicon inspired hierarchical TDM NoC with efficient power-performance trading. IEEE Trans Parallel Distrib Syst 31(3):675–694

  14. Shafique M, Garg S (2017) Computing in the dark silicon era: current trends and research challenges. IEEE Des Test 34(2017):8–23

  15. Yao Y (2023) Game-of-life temperature-aware DVFS strategy for tile-based chip many-core processor. IEEE J Emerging Sel Top Circuits Syst

  16. Li Z, Miguel JS, Jerger NE (2016) The runahead network-on-chip. In: 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, pp. 333–344

  17. Liu Z, Li G, Cheng J (2023) Efficient accelerator/network co-search with circular greedy reinforcement learning. IEEE Trans Circuits Syst II

  18. Lu H, Yan G, Han Y, Wang Y, Li X (2015) ShuttleNoC: boosting on-chip communication efficiency by enabling localized power adaptation. In: Proceedings of 20th Asia and South Pacific Design Automation Conference, IEEE, pp. 142–147

  19. Lu H, Chang Y, Yan G, Lin N, Wei X, Li X (2019) ShuttleNoC: power-adaptable communication infrastructure for many-core processors. IEEE Trans Comput Aided Des Integr Circuits Syst 38:1438–1451

  20. Asadi B, Zia SM, Al-Khafaji HMR, Mohamadian A (2023) Network-on-chip and photonic network-on-chip basic concepts: a survey. J Electron Test

  21. Yoon YJ, Concer N, Petracca M, Carloni LP (2013) Virtual channels and multiple physical networks: two alternatives to improve NoC performance. IEEE Trans Comput Aided Des Integr Circuits Syst 32:1906–1919

  22. Li X, Yan G, Liu C (2023) Fault-tolerant network-on-chip. In: Built-in fault-tolerant computing paradigm for resilient large-scale chip design. Springer: Singapore

  23. Kadri N, Koudil M (2019) A survey on fault-tolerant application mapping techniques for Network-on-Chip. J Syst Archit 92:39–52

  24. Sepúlveda J, Flórez D, Gogniat G (2015) Reconfigurable security architecture for disrupted protection zones in NoC-based MPSoCs. In: Proceedings of 10th International Symposium on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC). IEEE, pp. 1–8

  25. Ali U, Sahni SAR, Khan O (2023) Characterization of timing-based software side-channel attacks and mitigations on network-on-chip hardware. J Emerg Technol Comput Syst

  26. Baharloo M, Aligholipour R, Abdollahi M, Khonsari A (2020) ChangeSUB: a power efficient multiple network-on-chip architecture. Comput Electr Eng 83(2020):106578

  27. Aligholipour R, Baharloo M, Farzaneh B, Abdollahi M, Khonsari A (2021) TAMA: turn-aware mapping and architecture: a power-efficient network-on-chip approach. ACM Trans Embed Comput Syst 20(2021):1–24

  28. Rovinski A (2022) Towards free, open, and ubiquitous hardware design. University of Michigan, PhD dissertation

  29. Alimi I, Aboderin O, Muga NJ, Teixeira AL (eds) (2022). IntechOpen, England

  30. Alagarsamy A, Mahilmaran S, Gopalakrishnan L, Ko S-B (2023) SaHNoC: an optimal energy efficient hybrid networks-on-chip architecture. J Supercomp 79:6538–6559

  31. Yadav S, Laxmi V, Gaur MS (2016) A power efficient dual link mesh NOC architecture to support nonuniform traffic arbitration at routing logic. In: Proceedings of the 29th International Conference on VLSI Design (VLSID). IEEE, pp. 69–74

  32. Yadav S (2022) A study on requests serialization in directory-based protocol for MESI cache coherence protocol. In: Soft Computing: Theories and Applications: Proceedings of SoCTA 2021. Springer, pp. 761–768

  33. Yadav S, Laxmi V, Kapoor HK, Gaur MS, Zwolinski M (2018) A power efficient crossbar arbitration in multi-NoC for multicast and broadcast traffic. In: Proceedings of International Conference on IEEE International Symposium on Smart Electronic Systems (IEEE-iSES). IEEE

  34. Yadav S, Laxmi V, Gaur MS, Bhargava M (2015) C\(^2\) -DLM: cache coherence aware dual link mesh for on-chip interconnect. In: Proceedings 19th IEEE International Symposium on VLSI Design and Test, IEEE

  35. Zhou W, Ouyang Y, Lu Y, Liang H (2022) A router architecture with dual input and dual output channels for Networks-on-Chip. Microprocess Microsyst 90:104464

  36. Yoon YJ (2017) Design and optimization of Networks-on-Chip for future heterogeneous systems-on-chip. Thesis of Columbia University

  37. Volos S, Seiculescu C, Grot B, Pour NK, Falsafi B, Micheli G de (2012) CCNoC: specializing on-chip interconnects for energy efficiency in cache-coherent servers. In: IEEE/ACM Sixth International Symposium on Networks-on-Chip. IEEE, pp. 67–74

  38. Mirhosseini A, Sadrosadati M, Soltani B, Sarbazi-Azad H (2022) A power-performance balanced network-on-chip for mixed CPU-GPU systems. Adv Comput 2022:45

  39. Balfour J, Dally WJ (2006) Design tradeoffs for tiled CMP on-chip networks. In: ACM International Conference on Supercomputing 25th Anniversary Volume. ACM, pp. 390–401

  40. Kunthara RG, James RK, Sleeba SZ, Jose J (2022) DAReS: deflection aware rerouting between subnetworks in bufferless on-chip networks. In: Proceedings of the Great Lakes Symposium on VLSI, pp. 211–216

  41. Miguel JS, Jerger NE (2015) Data criticality in network on chip design. In: Proceedings of the 9th International Symposium on Networks-on-Chip (NOCS). ACM, pp. 1–8

  42. Mishra AK, Mutlu O, Das CR (2013) A heterogeneous multiple network-on-chip design: an application-aware approach. In: 50th ACM/EDAC/IEEE Design Automation Conference (DAC). IEEE, pp. 1–10

  43. Mandal SK, Ayoub R, Kishinevsky M, Ogras UY (2019) Analytical performance models for NoCs with multiple priority traffic classes. ACM Trans Embed Comput Syst (TECS) 18(5s):1–21

  44. Buckler M, Burleson W, Sadowski G (2013) Low-power networks-on-chip: progress and remaining challenges. In: International Symposium on Low Power Electronics and Design (ISLPED). IEEE, pp. 132–134

  45. Trik M, Akhavan H, Bidgoli AM, Molk AMNG, Vashani H, Mozaffari SP (2023) A new adaptive selection strategy for reducing latency in networks on chip. Integration 89:9–24

  46. Trik M, Molk AMNG, Ghasemi F, Pouryeganeh P (2022) A hybrid selection strategy based on traffic analysis for improving performance in networks on chip. J Sensors, 3112170

  47. Ofori-Attah E, Agyeman MO (2017) A survey of low power NoC design techniques. In: Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems (AISTECS ’17). Association for Computing Machinery, New York, NY, USA, pp. 22–27

  48. Singh R, Bohra M, Hemrajani P, Kalla A, Bhatt DP, Purohit N, Daneshtalab M (2022) Review, analysis, and implementation of path selection strategies for 2D NoCs. IEEE Access 10:129245–129268

  49. Rad F, Reshadi M, Khademzadeh A (2020) A survey and taxonomy of congestion control mechanisms in wireless network on chip. J Syst Archit 108:101807

  50. Fang Z, Cheng L, Vangal SR (2009) Using criticality information to route cache coherency communications. U.S. Patent US20090300292 A1

  51. Nicopoulos CA, Park D, Kim J, Vijaykrishnan N, Yousif MS, Das CR (2006) ViChaR: a dynamic virtual channel regulator for network-on-chip routers. In: Proceedings of the Thirty-ninth IEEE/ACM International Symposium on Microarchitecture (MICRO’06), Orlando, FL, pp. 333–346

  52. Lai M, Wang Z, Gao L, Lu H, Dai K (2008) A dynamically-allocated virtual channel architecture with congestion awareness for on-chip routers. In: Proceedings of the Forty-fifth ACM/IEEE Design Automation Conference, Anaheim, CA, pp. 630–633

  53. Baharloo M, Khonsari A, Dolati M, Shiri P, Ebrahimi M, Rahmati D (2020) Traffic-aware performance optimization in Real-time wireless network on chip. Nano Commun Netw 26:100321

  54. Gogte V, Kolli A, Wenisch TF (2022) A primer on memory persistency. Synth Lect Comput Architect 1(2022):1–115

  55. Binkert N, Beckmann B, Black G, Reinhardt SK, Saidi A, Basu A, Hestness J, Hower DR, Krishna T, Sardashti S et al (2011) The gem5 simulator. ACM SIGARCH Comput Archit News 39(2011):1–7

  56. Semakin A (2021) Simulation of a multi-core computer system in the gem5 simulator. In: AIP Conference Proceedings. https://doi.org/10.1063/5.0035841

  57. Maron CAF, Vogel A, Griebler D, Fernandes LG (2019) Should PARSEC benchmarks be more parametric? a case study with Dedup. In: 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 217–221. IEEE

  58. Krishna T (2017) A detailed on-chip network model inside a full-system simulator. In: Gem5 Workshop. ARM Research Summit

  59. Zhang H, Chen Y, Huang Z, Xia C, Liang J, Gu H (2021) Comparative analysis of simulators for optical network-on-chip (ONoC). In: 12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), IEEE, pp. 19–23

  60. Sethi MAJ, Hussin FA, Hamid NH (2017) Network-on-Chip (NoC) topologies and performance: a review. Rev Netw Chip Archit 10(1):4–29

  61. Vogel RM (2022) The geometric mean? Commun Stat Theory Methods 51(1):82–94

  62. Xiang X, Sigdel P, Tzeng N-F (2020) Bufferless network-on-chips with bridged multiple subnetworks for deflection reduction and energy savings. IEEE Trans Comput 69(2020):577–590

  63. Baharloo M, Khonsari A (2018) A low-power wireless-assisted multiple network-on-chip. Microprocess Microsyst 63(2018):104–115

  64. Duraisamy K, Hao L, Pande PP, Kalyanaraman A (2016) High-performance and energy-efficient network-on-chip architectures for graph analytics. ACM Trans Embed Comput Syst 15(2016):1–26

  65. Dasari UK, Temam O, Narayanaswami R, Woo DH (2021) Apparatus and mechanism for processing neural network tasks using a single chip package with multiple identical dies. U.S. Patent 10, 15/819,753

Funding

Not applicable.

Author information

Contributions

The first author originated the idea, implemented it, and wrote the paper. The second, third, and fourth authors helped to conceptualize the idea, organize the manuscript, and analyze the results. The fifth author contributed to the revision of the manuscript.

Corresponding author

Correspondence to Sonal Yadav.

Ethics declarations

Conflict of interest

The authors declare no competing interests regarding this work.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Message distribution and cache state transitions

Table 15 in this appendix summarizes 41 alternative message distributions on dual-NoC that are executed for each PARSEC benchmark application described in Table 3 (Sect. 3). Figures 9 and 10 are tabular representations of cache state transition graphs, as discussed in Sect.  4, Figs. 1 and 2, respectively.

See Figs. 9 and 10.

Table 15 Each row except the first indicates one static message distribution out of the 41 combinations. Static message distribution is not applicable (NA) for the single-NoC (first row)

Fig. 9

Tabular representation of L\(_1\) cache state transitions and the respective messages. The rows and columns of the table represent the stable and transient states of the L\(_1\) cache. The messages in each table cell are generated as the cache transitions from one state to another

Fig. 10

Tabular representation of L\(_2\) cache state transitions and the respective messages. The rows and columns of the table represent the stable and transient states of the L\(_2\) cache. The messages in each table cell are generated as the cache transitions from one state to another

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yadav, S., Laxmi, V., Kapoor, H. et al. Adaptive distribution of control messages for improving bandwidth utilization in multiple NoC. J Supercomput 79, 17208–17246 (2023). https://doi.org/10.1007/s11227-023-05208-0

  • DOI: https://doi.org/10.1007/s11227-023-05208-0
