Abstract
Multiple networks-on-chip is a popular on-chip interconnect. This parallel communication infrastructure uses more than one NoC to facilitate customized traffic distribution. Parallel architectures improve performance, but at the cost of high power dissipation. We propose a power-efficient customized placement of the network selector hardware unit in the control plane of the router. The network selector is the hardware unit that distributes traffic between NoCs. Conventionally, this unit is placed in the data plane at the network interface. We instead place the network selector at the switch allocator and at the routing unit of the router. Placement at the switch allocator is more efficient than placement at the routing unit or the network interface: it improves static power by 21%, dynamic power by 29%, and the critical path delay of the circuit by 33% over network interface placement.
Notes
The physical capacitance can be minimized in a number of ways, including circuit style selection, transistor sizing, placement and routing, and architectural optimizations [46]. These are part of fabrication and hence beyond the scope of this paper.
We use the terms control plane and control path synonymously.
3 and 4 bits for our proposed architecture.
128/256/512 bits (128 bits for our proposed architecture).
Recent NoC architectures use higher-bandwidth networks (for example, a link width of 512 bits is required to sustain modern per-core bandwidth [49]), so placement in the data path significantly increases hardware overhead.
Here, \(I_{NI}\) is the number of NI links, which are the inputs to the router, and \(I_{R}\) is the number of inputs from other routers.
We place the Net-Demux for only one NoC, as the other NoC already carries separate traffic.
FLow control unITs.
Five flits for our NoC architecture.
Here, K and N are the numbers of input bits.
Refer to Sect. 3.2.
Refer to Sect. 3.1.
Switching activity has two components: (1) a static component, which is a function of the logic's topology, and (2) a dynamic component, which is a function of the timing behavior (glitching). We do not discuss the dynamic component in detail, as it is beyond the scope of this paper.
Low Voltage Threshold [50].
The actual area advantage of SA over NI would be larger. These area results were obtained with the virtual channels and virtual networks omitted from the circuit.
The crossbar of the router dominates the critical path of the router.
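The notes above contrast signals that are 3 and 4 bits wide with data links of 128/256/512 bits, and argue that placing the selector in the data path raises hardware overhead. As a rough illustration only, the sketch below counts how many bits a 1-to-2 network selector would have to steer at each width. It assumes the narrow widths belong to control-plane signals and the wide ones to the data path. The function names `demux_bit_slices` and `overhead_ratio` and the one-bit-slice-per-steered-bit cost model are hypothetical and are not the paper's power or area model.

```python
# First-order, hypothetical cost proxy for a network selector (demultiplexer).
# Assumption: cost scales with the number of bits steered times the number of
# output networks. This is NOT the power/area model used in the paper.

DATA_PATH_WIDTHS = (128, 256, 512)  # link widths in bits, as listed in the notes
CONTROL_WIDTHS = (3, 4)             # narrow signal widths in bits, as listed in the notes


def demux_bit_slices(signal_width_bits: int, num_networks: int = 2) -> int:
    """Return the number of 1-bit demux slices needed to steer a signal."""
    if signal_width_bits <= 0:
        raise ValueError("signal_width_bits must be positive")
    if num_networks < 2:
        raise ValueError("a selector needs at least two networks")
    return signal_width_bits * num_networks


def overhead_ratio(data_width_bits: int, control_width_bits: int) -> float:
    """Ratio of data-path selector slices to control-plane selector slices."""
    return demux_bit_slices(data_width_bits) / demux_bit_slices(control_width_bits)


if __name__ == "__main__":
    for data_width in DATA_PATH_WIDTHS:
        for control_width in CONTROL_WIDTHS:
            ratio = overhead_ratio(data_width, control_width)
            print(f"data {data_width:>3} b vs control {control_width} b: "
                  f"{ratio:.1f}x more bit slices in the data path")
```

Under this simplified model, the gap between the two placements grows linearly with link width, which is consistent with the direction of the argument in the notes but says nothing about the measured power or delay figures.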
References
Reddy GT, Reddy MPK, Lakshmanna K, Kaluri R, Rajput DS, Srivastava G, Baker T (2020) Analysis of dimensionality reduction techniques on big data. IEEE Access 8(2020):54776–54788
Jain A, Laxmi V, Tripathi M et al (2019) S2DIO: an extended scalable 2D mesh network-on-chip routing reconfiguration for efficient bypass of link failures. J Supercomput 75:6855–6881
Zhan J, Xie Y, Sun G (2014) NoC-sprinting: interconnect for fine-grained sprinting in the dark silicon era. In: Proceedings of the 51st Annual Design Automation Conference, pp 1–6
Flores A, Aragón JL, Acacio ME (2008) An energy consumption characterization of on-chip interconnection networks for tiled CMP architectures. J Supercomput 45:341–364
Xiang X, Sigdel P, Tzeng N-F (2020) Bufferless network-on-chips with bridged multiple subnetworks for deflection reduction and energy savings. IEEE Trans Comput 69(4):577–590
McKeown M, Fu Y, Nguyen T, Zhou Y, Balkind J, Lavrov A, Shahrad M, Payne S, Wentzlaff D (2017) Piton: a manycore processor for multitenant clouds. IEEE Micro 37(2):70–80
Sodani A, Gramunt R, Corbal J, Kim H-S, Vinod K, Chinthamani S, Hutsell S, Agarwal R, Liu Y-C (2016) Knights landing: second-generation intel Xeon Phi product. IEEE Micro 36(2):34–46
Daya BK, Chen C-HO, Subramanian S, Kwon W-C, Park S, Krishna T, Holt J, Chandrakasan AP, Peh L-S (2014) SCORPIO: a 36-core research chip demonstrating snoopy coherence on a scalable mesh NoC with in-network ordering. In: Proceedings of 41st international symposium on computer architecture (ISCA), pp 25–36
Wentzlaff D, Griffin P, Hoffmann H, Bao L, Edwards B, Ramey C, Mattina M, Miao C-C, Brown JF III, Agarwal A (2007) On-chip interconnection architecture of the tile processor. IEEE Micro 27(5):15–31
Gratz P, Kim C, Sankaralingam K, Hanson H, Shivakumar P, Keckler S, Burger D (2007) On-chip interconnection networks of the TRIPS chip. IEEE Micro 27(5):41–50
Taylor MB, Kim J, Miller J, Wentzlaff D, Ghodrat F, Greenwald B, Hoffman H, Johnson P, Lee J-W, Lee W, Ma A, Saraf A, Seneski M, Shnidman N, Strumpen V, Frank M, Amarasinghe S, Agarwal A (2002) The raw microprocessor: a computational fabric for software circuits and general-purpose programs. IEEE Micro 22(2):25–35
Wang Z, Ma S, Huang L, Lai M, Shi W (2015) Network-on-chip customizations for message passing interface primitives. Morgan Kaufmann, Networks-On-Chip, Burlington, pp 285–315
Semakin AN (2021) Simulation of a multi-core computer system in the gem5 simulator. In: AIP Conference Proceedings. 2318, 1. AIP Publishing LLC
France L, Bruguier F, Mushtaq M, Novo D, Benoit P (2021) Implementation of Rowhammer effect in gem5. In: 15ème Colloque National du GDR SoC2
Das A, Kumar A, Jose J, Palesi M (2021) Opportunistic caching in NoC: exploring ways to reduce miss penalty. IEEE Trans Comput 70(06):892–905
Dahir N, Karkar A, Palesi M, Mak T, Yakovlev A (2021) Power density aware application mapping in mesh-based network-on-chip architecture: an evolutionary multi-objective approach. Integration 81(2021):342–353
Bienia C, Kumar S, Singh JP, Li K (2008) The PARSEC benchmark suite: characterization and architectural implications. In: International Conference on Parallel Architectures and Compilation Techniques (PACT), pp 72–81
Yadav S, Laxmi V, Gaur MS, Kapoor HK (2019) Late breaking results: improving static power efficiency via placement of network demultiplexer over control plane of router in multi-NoCs. In: 2019 56th ACM/IEEE Design Automation Conference (DAC), pp 1–2
Yadav S, Laxmi V, Gaur MS (2020) Multiple-NoC exploration and customization for energy efficient traffic distribution. In: 2020 IFIP/IEEE 28th International Conference on Very Large Scale Integration (VLSI-SOC), pp 200–201
Abts D, Jerger NDE, Kim J, Gibson D, Lipasti MH (2009) Achieving predictable performance through better memory controller placement in many-core CMPs. In: Proceedings of the 36th annual international symposium on computer architecture (ISCA). ACM, pp 451–461
Zhao H, Zhang F, Chen L, Lu M (2021) A method of fast evaluation of an MC placement for network-on-chip. J Circuits Syst Comput 30(7):2150115
Hung W, Addo-Quaye C, Theocharides T, Xie Y, Vijaykrishnan N, Irwin MJ (2004) Thermal-aware IP virtualization and placement for networks-on-chip architecture. In: Proceedings of the International Conference on Computer Design. IEEE, pp 430–437
Hu J, Marculescu R (2003) Energy-aware mapping for tile-based NoC architecture under performance constraints. In: Proceedings of the Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, pp 233–239
Srinivasan K, Chatha K (2005) A technique for low energy mapping and routing in network-on-chip architectures. In: Proceedings of the international symposium on low power electronics and design. IEEE, pp 387–392
Robaei M, Zhao H (2019) Broadcast-based hybrid wired-wireless NoC for efficient data transfer in GPU of CE systems. IEEE Consum Electron Mag 8(6):62–67
Kullu P, Ar Y, Tosun S, Ozdemir S (2020) Mapping application-specific topology to mesh topology with reconfigurable switches. IET Comput Digital Tech 14(1):9–16
Bayar S, Yurdakul A (2016) An efficient mapping algorithm on 2-D mesh network-on-chip with reconfigurable switches. In: 2016 International conference on design and technology of integrated systems in nanoscale Era (DTIS), Istanbul, Turkey, pp 1–4
Baharloo M, Aligholipour R, Abdollahi M, Khonsari A (2020) ChangeSUB: a power efficient multiple network-on-chip architecture. Comput Electr Eng 83(2020):106578
Shahidinejad A, Fathi S (2018) Wireless-assisted multiple network on chip using microring resonators. Microprocess Microsyst 63(2018):190–198
Ejaz A, Papaefstathiou V, Sourdis I (2018) DDRNoC: dual data-rate network-on-chip. ACM Trans Archit Code Optim 15(2):1–24
Rad F, Reshadi M, Khademzadeh A (2021) A novel arbitration mechanism for crossbar switch in wireless network-on-chip. Cluster Comput 24(2021):1185–1198
Ma W, Gao X, Gao Y, Yu N (2021) A latency-optimized network-on-chip with rapid bypass channels. Micromachines 12(6):621
Neelkamal, Yadav S, Kapoor HK (2019) Lightweight message encoding of power-gating controller for on-time wakeup of gated router in network-on-chip. In: 2019 9th International symposium on embedded computing and system design (ISED), pp 1–6
Balfour J, Dally WJ (2006) Design tradeoffs for tiled CMP on-chip networks. In: Proceedings of the 20th Annual International Conference on Supercomputing. ACM, pp 187–198
Carara E, Calazans N, Moraes F (2007) Router architecture for high-performance NoCs. In: Proceedings of the 20th Annual Conference on Integrated Circuits and Systems Design (SBCCI). ACM, pp 111–116
Yoon YJ, Concer N, Petracca M, Carloni LP (2013) Virtual channels and multiple physical networks: two alternatives to improve NOC performance. IEEE Trans Comput Aided Des Integr Circuits Syst 32(12):1906–1919
Grot B, Hestness J, Keckler SW, Mutlu O (2009) Express cube topologies for on-chip interconnects. In: International symposium on high-performance computer architecture (HPCA). IEEE, pp 163–174
Kumar P, Pan Y, Kim J, Memik G, Choudhary A (2009) Exploring concentration and channel slicing in on-chip network router. In: Proceedings of 3rd ACM/IEEE international symposium on networks-on-chip. pp 276–285
Gómez C, Gómez ME, López P, Duato J (2008) Exploiting wiring resources on interconnection network: Increasing path diversity. Working on Parallel, Distributed, and Network-Based Processing. pp 20–29
Teimouri N, Modarressi M, Tavakkol A, Sarbazi-azad H (2011) Energy-optimized on-chip networks using reconfigurable shortcut paths. In: Conference on Architecture of Computing Systems. pp 231–242
Noh S, Ngo V-D, Jao H, Choi H-W (2006) Multiplane virtual channel router for network-on-chip design. In: First International Conference on Communications and Electronics, pp 348–351
Gilabert F, Gómez ME, Medardoni S, Bertozzi D (2010) Improved Utilization of NoC channel bandwidth by switch replication for cost-effective multi-processor systems-on-chip. In: Proceedings of fourth ACM/IEEE international symposium on networks-on-chip (NOCS), pp 165–172
Xiang X, Sigdel P, Tzeng N (2020) Bufferless network-on-chips with bridged multiple subnetworks for deflection reduction and energy savings. IEEE Trans Comput 69(4):577–590
Morgan AA, Hassan AS, El-Kharashi MW, Tawfik A (2020) \(\text{ NoC}^2\): an efficient interfacing approach for heavily-communicating NoC-based systems. IEEE Access 8:185992–186011
Volos S, Seiculescu C, Grot B, Pour NK, Falsafi B, Micheli GD (2012) CC-NoC: specializing On-chip interconnects for energy efficiency in cache-coherent servers. In: Proceedings of Sixth IEEE/ACM international symposium on networks on chip (NoCS), pp 67–74
Rabaey JM (1996) Digital integrated circuits: a design perspective. Prentice-Hall Inc., New York
Kang KW, Samsung Electronics Co Ltd (2005) Layout structures of data input/output pads and peripheral circuits of integrated circuit memory devices. U.S. Patent 6,847,576
Huang T-W, Lin C-X, Wong MDF (2021) OpenTimer v2: a parallel incremental timing analysis engine. IEEE Des Test 38(2):62–68
Das R, Narayanasamy S, Satpathy SK, Dreslinski RG (2013) Catnap: energy proportional multiple network-on-chip. In: Proceedings of the 40th annual international symposium on computer architecture (ISCA). ACM, pp 320–331
Bokhari H, Javaid H, Shafique M, Henkel J, Parameswaran S (2015) SuperNet: multimode interconnect architecture for manycore chips. In: Proceedings of the 52nd Annual Design Automation Conference (DAC), vol 85. ACM, pp 1–6
Bai X, Visweswariah C, Strenski PN, Hathaway DJ (2002) Uncertainty-aware circuit optimization. In: Proceedings 2002 Design Automation Conference. (IEEE Cat.No.02CH37324), pp 58–63
Fung R, Betz V, Chow W (2008) Slack Allocation and Routing to Improve FPGA Timing While Repairing Short-Path Violations. IEEE Trans Comput Aided Des Integr Circuits Syst 27(4):686–697
Dally W, Towles B (2003) Principles and practices of interconnection networks. Morgan Kaufmann Publisher, London
Binkert N, Beckmann B, Black G, Reinhardt SK, Saidi A, Basu A, Hestness J, Hower DR, Krishna T, Sardashti S, Sen R, Sewell K, Shoaib M, Vaish N, Hill MD, Wood DA (2011) The gem5 simulator. SIGARCH Comput Archit News 39(2):1–7
Agarwal N, Krishna T, Peh L-S, Jha NK (2009) Garnet: a detailed on-chip network model inside a full-system simulator. In: IEEE international symposium on performance analysis of systems and software (ISPASS). IEEE, pp 33–42
Kahng AB, Li B, Peh L-S, Samadi K (2012) ORION 2.0: a power-area simulator for interconnection networks. IEEE Trans Very Large Scale Integr (TVLSI) Syst 20(1):191–196
Miguel JS, Jerger NE (2015) Data criticality in network on chip design. In: Proceedings of the ninth IEEE/ACM international symposium on network on chip (NOCS), pp 28–30
Baharloo M, Khonsari A (2018) A low-power wireless-assisted multiple network-on-chip. Microprocess Microsyst 63(2018):104–115
Dasari UK, Temam O, Narayanaswami R, Woo DH, Google LLC (2021) Apparatus and mechanism for processing neural network tasks using a single chip package with multiple identical dies. U.S. Patent 10,936,942
Cite this article
Yadav, S., Raj, R. Power efficient network selector placement in control plane of multiple networks-on-chip. J Supercomput 78, 6664–6695 (2022). https://doi.org/10.1007/s11227-021-04098-4
DOI: https://doi.org/10.1007/s11227-021-04098-4