Adaptive distribution of control messages for improving bandwidth utilization in multiple NoC

Yadav, Sonal; Laxmi, Vijay; Kapoor, Hemangee; Gaur, Manoj Singh; Kumar, Amit

doi:10.1007/s11227-023-05208-0

Published: 07 May 2023

Volume 79, pages 17208–17246, (2023)
The Journal of Supercomputing

Sonal Yadav¹,
Vijay Laxmi²,
Hemangee Kapoor³,
Manoj Singh Gaur²,⁵ &
Amit Kumar⁴

Abstract

Networks-on-chip (NoCs) have advanced considerably as the number of cores has increased. Bandwidth demand has grown steadily as network traffic has increased owing to high-workload applications. NoC traffic is broadly divided into control messages and data messages, of which data messages are larger. Because NoC channel bandwidth is set in proportion to the size of the data messages, the bandwidth remains underutilized while control messages are transmitted. This adversely affects NoC power and performance efficiency. In modern NoC architectures, multiple NoC is a popular way to utilize bandwidth efficiently because it offers more than one physical channel for traffic communication. Conventional multiple-NoC architectures distribute traffic statically between the NoCs, which significantly affects the power-performance metrics. We have observed up to fivefold variation in energy efficiency during the analysis of static traffic distribution for multiple NoC. In this paper, we propose an adaptive distribution of control messages for multiple NoC to improve bandwidth utilization. Control messages switch between the NoC networks according to the runtime utilization of the networks. The proposed adaptive distribution of control messages improves energy efficiency by up to \(72.7\%\) and \(66.9\%\) on average over single-NoC and static traffic distribution in multiple NoC, respectively. Link utilization also improves by \(1.37\times\) and \(40\%\) on average over single-NoC and conventional static traffic distribution, respectively. Thus, the proposed adaptive distribution overcomes the implications of static traffic distribution.
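
The idea of steering control messages by runtime utilization can be illustrated with a minimal Python sketch. It is not the authors' mechanism, which the abstract does not detail. The names `NocState`, `pick_noc_for_control`, and `inject`, the per-window flit counters, and the rule that data messages stay pinned to one network are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class NocState:
    """Hypothetical runtime bookkeeping for one physical network."""
    name: str
    capacity_flits: int        # assumed flits the network can carry per sampling window
    flits_in_window: int = 0   # flits injected so far in the current window

    @property
    def utilization(self) -> float:
        # Fraction of the window's capacity that has been used.
        return self.flits_in_window / self.capacity_flits


def pick_noc_for_control(nocs):
    """Return the least-utilized network; ties go to the earlier entry."""
    return min(nocs, key=lambda n: n.utilization)


def inject(nocs, kind, flits, data_noc_index=0):
    """Inject one message and return the name of the network it used.

    Assumption: data messages are pinned to a single network, while
    control messages follow the current utilization.
    """
    if kind == "data":
        target = nocs[data_noc_index]
    else:
        target = pick_noc_for_control(nocs)
    target.flits_in_window += flits
    return target.name


nocs = [NocState("noc0", 100), NocState("noc1", 100)]
first_data = inject(nocs, "data", 5)                         # pinned to noc0
control_path = [inject(nocs, "control", 1) for _ in range(6)]
```

In this toy run the first five control messages go to the idle network. The sixth falls back to the first network once both report the same utilization. A static distribution would instead fix the target per message type regardless of load.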

Data Availability Statement

Not applicable.

Notes

Single-Chip Cloud Computer.
Bits per second.
Gigabytes per second.
Terabits per second.
The number of messages.
The dotted red box shows the volume of control messages, whereas the blue-lined box (the rightmost column of the table) highlights the percentage variation in the result metric.
We use the terms percentage, fraction, ratio, and proportion interchangeably.
Least Recently Used (LRU) replacement policy.
The names of fine-grained messages indicate message class, message type, and cache association.
Less-frequent messages.
The gem5 simulator supports various CPU models, instruction set architectures (ISAs), caches, and cache-coherence protocol models.
In full-system simulation, the hardware of the computer system is simulated at a level of detail such that the complete software stack from a real system can run on the simulator.
Each router connects with the rest of the routers via four links {N, S, E, W} in the North, South, East, and West directions, specific to the mesh topology.
RMS: recognition, mining, and synthesis.
The data patterns further depend on the number of cores, cache size, and cache line size.
Floating-point operations (FLOPs): a computation unit used in high-performance computing for improving accuracy in results.
These are the synchronization primitives.
The geometric mean is the standard way to average normalized results [61].
For the classification of the benchmarks, please refer to Table 10.
One message transmits through more than one NoC.
Each NoC carries different types of messages.
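
As a small illustration of the note on averaging normalized results, the following Python sketch computes a geometric mean and contrasts it with the arithmetic mean. The helper name `geometric_mean` and the sample ratios are hypothetical and are not taken from the article's results.

```python
import math


def geometric_mean(values):
    """Geometric mean of strictly positive numbers, computed via logarithms."""
    if not values:
        raise ValueError("at least one value is required")
    if any(v <= 0 for v in values):
        raise ValueError("all values must be strictly positive")
    return math.exp(sum(math.log(v) for v in values) / len(values))


# Hypothetical normalized results (ratio to a baseline) for two benchmarks:
# one halves the metric, the other doubles it.
ratios = [0.5, 2.0]
gm = geometric_mean(ratios)        # the two opposite changes cancel out
am = sum(ratios) / len(ratios)     # the arithmetic mean overstates the gain
```

With these two ratios the geometric mean is 1 up to floating-point rounding, so the halving and the doubling cancel. The arithmetic mean is 1.25, which would suggest a net improvement that is not there.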

References

Yadav S (2022) Interconnect paradigm shift towards networks-on-chip in manycore processors: a review on challenges. In: Kumar R, Ahn CW, Sharma TK, Verma OP, Agarwal A (eds) Soft computing: theories and applications. Lecture notes in networks and systems. Springer, Cham, p 425
Morgan AA, Hassan AS, Watheq El-Kharashi M, Tawfik A (2020) NoC\(^2\): an efficient interfacing approach for heavily-communicating NoC-based systems. IEEE Access 8(2020):185992–186011
Zhang C, Zhao C, He J, Chen S, Zheng L, Huang K, Han W, Zhai J (2021) Critique of planetary normal mode computation: parallel algorithms, performance, and reproducibility by SCC Team From Tsinghua University. IEEE Trans Parallel Distrib Syst 32(11):2631–2634
Kang J-H, Hwang J, Hyung JS, Ryu H (2021) High-performance simulations of turbulent boundary layer flow using Intel Xeon Phi many-core processors. J Supercomput 77(9):9597–9614
Ginosar R (2021) The plural many-core architecture-high performance at low power. In: Multi-processor system-on-chip 1: architectures, pp. 53-68
Das R, Narayanasamy S, Satpathy SK, Dreslinski RG (2013) Catnap: energy proportional multiple network-on-chip. ACM SIGARCH Comput Archit News 41(2013):320–331
Zhou W, Ouyang Y, Li J, Dongyu X (2023) A transparent virtual channel power gating method for on-chip network routers. Integration 88:286–297
Yadav S, Laxmi V, Gaur MS, Kapoor HK (2019) Improving static power efficiency via placement of network demultiplexer over control plane of router in multi-NoCs. In: Proceedings of 56th ACM/IEEE Design Automation Conference (DAC). IEEE, pp. 1–2
Yadav S, Raj R (2022) Power efficient network selector placement in control plane of multiple networks-on-chip. J Supercomput 78(2022):6664–6695
Zhou W, Ouyang Y, Xu D, Huang Z, Liang H, Wen X (2023) Energy-efficient multiple network-on-chip architecture with bandwidth expansion. IEEE Trans Very Large Scale Integr (VLSI) Syst Preprint 1–14
Yoon YJ, Concer N, Petracca M, Carloni L (2010) Virtual channels vs. multiple physical networks: a comparative analysis. In: Proceedings of 47th Conference on Design Automation Conference (DAC), ACM/EDAC/IEEE, pp. 162–165
Yadav S, Laxmi V, Gaur MS (2020) Multiple-NoC exploration and customization for energy efficient traffic distribution. In: IFIP/IEEE 28th International Conference on Very Large Scale Integration (VLSI-SOC). IEEE, pp. 200-201
Hesham S, Goehringer D, Abd MA, Ghany E (2020) HPPT-NoC: a dark-silicon inspired hierarchical TDM NoC with efficient power-performance trading. IEEE Trans Parallel Distrib Syst 31(3):675–694
Shafique M, Garg S (2017) Computing in the dark silicon era: current trends and research challenges. IEEE Des Test 34(2017):8–23
Yao Y (2023) Game-of-life temperature-aware DVFS strategy for tile-based chip many-core processor. IEEE J Emerging Sel Top Circuits Syst
Li Z, Miguel JS, Jerger NE (2016) The runahead network-on-chip. In: 2016 IEEE International Symposium on High Performance Computer Architecture (HPCA). IEEE, pp. 333–344
Liu Z, Li G, Cheng J (2023) Efficient accelerator/network co-search with circular greedy reinforcement learning. IEEE Trans Circuits Syst II
Lu H, Yan G, Han Y, Wang Y, Li X (2015) ShuttleNoC: boosting on-chip communication efficiency by enabling localized power adaptation. In: Proceedings of 20th Asia and South Pacific Design Automation Conference, IEEE, pp. 142–147
Lu H, Chang Y, Yan G, Lin N, Wei X, Li X (2019) ShuttleNoC: power-adaptable communication infrastructure for many-core processors. IEEE Trans Comput Aided Des Integr Circuits Syst 38:1438–1451
Asadi B, Zia SM, Al-Khafaji HMR, Mohamadian A (2023) Network-on-chip and photonic network-on-chip basic concepts: a survey. J Electron Test
Yoon YJ, Concer N, Petracca M, Carloni LP (2013) Virtual channels and multiple physical networks: two alternatives to improve NoC performance. IEEE Trans Comput Aided Des Integr Circuits Syst 32:1906–1919
Li X, Yan G, Liu C (2023) Fault-tolerant network-on-chip. In: Built-in fault-tolerant computing paradigm for resilient large-scale chip design. Springer: Singapore
Kadri N, Koudil M (2019) A survey on fault-tolerant application mapping techniques for Network-on-Chip. J Syst Archit 92:39–52
Sepúlveda J, Flórez D, Gogniat G (2015) Reconfigurable security architecture for disrupted protection zones in NoC-based MPSoCs. In: Proceedings of 10th International Symposium on Reconfigurable Communication-Centric Systems-on-Chip (ReCoSoC). IEEE, pp. 1–8
Ali U, Sahni SAR, Khan O (2023) Characterization of timing-based software side-channel attacks and mitigations on network-on-chip hardware. J Emerg Technol Comput Syst
Baharloo M, Aligholipour R, Abdollahi M, Khonsari A (2020) ChangeSUB: a power efficient multiple network-on-chip architecture. Comput Electr Eng 83(2020):106578
Aligholipour R, Baharloo M, Farzaneh B, Abdollahi M, Khonsari A (2021) TAMA: turn-aware mapping and architecture: a power-efficient network-on-chip approach. ACM Trans Embed Comput Syst 20(2021):1–24
Rovinski A (2022) Towards free, open, and ubiquitous hardware design. University of Michigan, PhD dissertation
Alimi I, Aboderin O, Muga NJ, Teixeira AL (eds) (2022). IntechOpen, England
Alagarsamy A, Mahilmaran S, Gopalakrishnan L, Ko S-B (2023) SaHNoC: an optimal energy efficient hybrid networks-on-chip architecture. J Supercomput 79:6538–6559
Yadav S, Laxmi V, Gaur MS (2016) A power efficient dual link mesh NOC architecture to support nonuniform traffic arbitration at routing logic. In: Proceedings of the 29th International Conference on VLSI Design (VLSID). IEEE, pp. 69–74
Yadav S (2022) A study on requests serialization in directory-based protocol for MESI cache coherence protocol. In: Soft Computing: Theories and Applications: Proceedings of SoCTA 2021. Springer, pp. 761–768
Yadav S, Laxmi V, Kapoor HK, Gaur MS, Zwolinski M (2018) A power efficient crossbar arbitration in multi-NoC for multicast and broadcast traffic. In: Proceedings of International Conference on IEEE International Symposium on Smart Electronic Systems (IEEE-iSES). IEEE
Yadav S, Laxmi V, Gaur MS, Bhargava M (2015) C\(^2\) -DLM: cache coherence aware dual link mesh for on-chip interconnect. In: Proceedings 19th IEEE International Symposium on VLSI Design and Test, IEEE
Zhou W, Ouyang Y, Lu Y, Liang H (2022) A router architecture with dual input and dual output channels for Networks-on-Chip. Microprocess Microsyst 90:104464
Yoon YJ (2017) Design and optimization of Networks-on-Chip for future heterogeneous systems-on-chip. Thesis of Columbia University
Volos S, Seiculescu C, Grot B, Pour NK, Falsafi B, Micheli G de (2012) CCNoC: specializing on-chip interconnects for energy efficiency in cache-coherent servers. In: IEEE/ACM Sixth International Symposium on Networks-on-Chip. IEEE, pp. 67–74
Mirhosseini A, Sadrosadati M, Soltani B, Sarbazi-Azad H (2022) A power-performance balanced network-on-chip for mixed CPU-GPU systems. Adv Comput 2022:45
Balfour J, Dally WJ (2006) Design tradeoffs for tiled CMP on-chip networks. In: ACM International Conference on Supercomputing 25th Anniversary Volume. ACM, pp. 390–401
Kunthara RG, James RK, Sleeba SZ, Jose J (2022) DAReS: deflection aware rerouting between subnetworks in bufferless on-chip networks. In: Proceedings of the Great Lakes Symposium on VLSI, pp. 211–216
Miguel JS, Jerger NE (2015) Data criticality in network on chip design. In: Proceedings of the 9th International Symposium on Networks-on-Chip (NOCS). ACM, pp. 1–8
Mishra AK, Mutlu O, Das CR (2013) A heterogeneous multiple network-on-chip design: an application-aware approach. In: 50th ACM/EDAC/IEEE Design Automation Conference (DAC). IEEE, pp. 1–10
Mandal SK, Ayoub R, Kishinevsky M, Ogras UY (2019) Analytical performance models for NoCs with multiple priority traffic classes. ACM Trans Embed Comput Syst (TECS) 18(5s):1–21
Buckler M, Burleson W, Sadowski G (2013) Low-power networks-on-chip: progress and remaining challenges. In: International Symposium on Low Power Electronics and Design (ISLPED). IEEE, pp. 132–134
Trik M, Akhavan H, Bidgoli AM, Molk AMNG, Vashani H, Mozaffari SP (2023) A new adaptive selection strategy for reducing latency in networks on chip. Integration 89:9–24
Trik M, Molk AMNG, Ghasemi F, Pouryeganeh P (2022) A hybrid selection strategy based on traffic analysis for improving performance in networks on chip. J Sensors, 3112170
Ofori-Attah E, Agyeman MO (2017) A survey of low power NoC design techniques. In: Proceedings of the 2nd International Workshop on Advanced Interconnect Solutions and Technologies for Emerging Computing Systems (AISTECS ’17). Association for Computing Machinery, New York, NY, USA, pp. 22–27
Singh R, Bohra M, Hemrajani P, Kalla A, Bhatt DP, Purohit N, Daneshtalab M (2022) Review, analysis, and implementation of path selection strategies for 2D NoCs. IEEE Access 10:129245–129268
Rad F, Reshadi M, Khademzadeh A (2020) A survey and taxonomy of congestion control mechanisms in wireless network on chip. J Syst Archit 108:101807
Fang Z, Cheng L, Vangal SR (2009) Using criticality information to route cache coherency communications. U.S. Patent US20090300292 A1
Nicopoulos CA, Park D, Kim J, Vijaykrishnan N, Yousif MS, Das CR (2006) ViChaR: a dynamic virtual channel regulator for network-on-chip routers. In: Proceedings of the Thirty-ninth IEEE/ACM International Symposium on Microarchitecture (MICRO’06), Orlando, FL, pp. 333–346
Lai M, Wang Z, Gao L, Lu H, Dai K (2008) A dynamically-allocated virtual channel architecture with congestion awareness for on-chip routers. In: Proceedings of the Forty-fifth ACM/IEEE Design Automation Conference, Anaheim, CA, pp. 630–633
Baharloo M, Khonsari A, Dolati M, Shiri P, Ebrahimi M, Rahmati D (2020) Traffic-aware performance optimization in Real-time wireless network on chip. Nano Commun Netw 26:100321
Gogte V, Kolli A, Wenisch TF (2022) A primer on memory persistency. Synth Lect Comput Architect 1(2022):1–115
Binkert N, Beckmann B, Black G, Reinhardt SK, Saidi A, Basu A, Hestness J, Hower DR, Krishna T, Sardashti S et al (2011) The gem5 simulator. ACM SIGARCH Comput Archit News 39(2011):1–7
Semakin A (2021) Simulation of a multi-core computer system in the gem5 simulator. In: AIP Conference Proceedings. https://doi.org/10.1063/5.0035841
Maron CAF, Vogel A, Griebler D, Fernandes LG (2019) Should PARSEC benchmarks be more parametric? a case study with Dedup. In: 2019 27th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 217–221. IEEE
Krishna T (2017) A detailed on-chip network model inside a full-system simulator. In: Gem5 Workshop. ARM Research Summit
Zhang H, Chen Y, Huang Z, Xia C, Liang J, Gu H (2021) Comparative analysis of simulators for optical network-on-chip (ONoC). In: 12th International Symposium on Parallel Architectures, Algorithms and Programming (PAAP), IEEE, pp. 19–23
Sethi MAJ, Hussin FA, Hamid NH (2017) Network-on-Chip (NoC) topologies and performance: a review. Rev Netw Chip Archit 10(1):4–29
Vogel RM (2022) The geometric mean? Commun Stat Theory Methods 51(1):82–94
Xiang X, Sigdel P, Tzeng N-F (2020) Bufferless network-on-chips with bridged multiple subnetworks for deflection reduction and energy savings. IEEE Trans Comput 69(2020):577–590
Baharloo M, Khonsari A (2018) A low-power wireless-assisted multiple network-on-chip. Microprocess Microsyst 63(2018):104–115
Duraisamy K, Hao L, Pande PP, Kalyanaraman A (2016) High-performance and energy-efficient network-on-chip architectures for graph analytics. ACM Trans Embed Comput Syst 15(2016):1–26
Dasari UK, Temam O, Narayanaswami R, Woo DH (2021) Apparatus and mechanism for processing neural network tasks using a single chip package with multiple identical dies. U.S. Patent 10, 15/819,753

Funding

Not applicable.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, National Institute of Technology Raipur, Raipur, India
Sonal Yadav
Department of Computer Science and Engineering, Malaviya National Institute of Technology Jaipur, Jaipur, India
Vijay Laxmi & Manoj Singh Gaur
Department of Computer Science and Engineering, Indian Institute of Technology Guwahati, Guwahati, India
Hemangee Kapoor
Department of Computer Science and Engineering, Indian Institute of Information Technology Kota, Jaipur, India
Amit Kumar
Indian Institute of Technology Jammu, Jammu, India
Manoj Singh Gaur

Contributions

The first author originated the idea, implemented it, and wrote the paper. The second, third, and fourth authors helped to conceptualize the idea, organize the manuscript, and analyze the results. The fifth author contributed to the revision of the manuscript.

Corresponding author

Correspondence to Sonal Yadav.

Ethics declarations

Conflict of interest

The authors declare no competing interests regarding this work.

Ethical approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: Message distribution and cache state transitions

Table 15 in this appendix summarizes 41 alternative message distributions on dual-NoC that are executed for each PARSEC benchmark application described in Table 3 (Sect. 3). Figures 9 and 10 are tabular representations of cache state transition graphs, as discussed in Sect. 4, Figs. 1 and 2, respectively.

See Figs. 9 and 10.

Table 15 Each row except the first indicates one static message distribution out of the 41 combinations. The static message distribution is Not Applicable (NA) for the single-NoC (first row)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Yadav, S., Laxmi, V., Kapoor, H. et al. Adaptive distribution of control messages for improving bandwidth utilization in multiple NoC. J Supercomput 79, 17208–17246 (2023). https://doi.org/10.1007/s11227-023-05208-0

Accepted: 17 March 2023
Published: 07 May 2023
Issue Date: October 2023
DOI: https://doi.org/10.1007/s11227-023-05208-0
