Abstract
Traditional NoC buffer designs are mainly based on SRAM, which cannot by itself overcome its high static power consumption. Emerging NVMs, such as the energy-efficient RTM (Racetrack Memory), offer a solution: using RTM instead of SRAM for the NoC buffer can directly reduce static energy to a near-zero level. However, RTM is not friendly to random access because of its port alignment operation, called an invalid shift. This paper proposes replacing the random-access FIFO buffer with a sequential LIFO buffer for lightweight transmission in the NoC, which avoids the cost of invalid shifts. However, the LIFO design flips the flit order during transmission and incurs an extra endianness-correction cost on odd paths. Therefore, this paper designs a hop-parity-involved task schedule that avoids odd paths in the communication among tasks, so that the extra endianness correction can be removed entirely. Our experiments show that the RTM-LIFO buffer design achieves over \(50\%\) energy savings compared with the SRAM-FIFO buffer.
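The flit-flipping effect described above can be illustrated with a small sketch. It is not the paper's implementation: it assumes one LIFO buffer per hop, that an "odd path" means a path with an odd hop count, and a 2D mesh with minimal routing so that the hop count is the Manhattan distance. The function names `send_through_lifo_hops`, `hop_count`, and `needs_correction` are hypothetical and exist only for this illustration.

```python
from typing import List, Tuple


def send_through_lifo_hops(flits: List[str], hops: int) -> List[str]:
    """Pass a packet through `hops` LIFO buffers (assumed: one per hop)."""
    for _ in range(hops):
        stack: List[str] = []
        for flit in flits:
            stack.append(flit)  # flits are pushed in arrival order
        # Popping a stack yields the flits in reversed order.
        flits = [stack.pop() for _ in range(len(stack))]
    return flits


def hop_count(src: Tuple[int, int], dst: Tuple[int, int]) -> int:
    """Manhattan distance on a 2D mesh (assumed minimal routing)."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1])


def needs_correction(src: Tuple[int, int], dst: Tuple[int, int]) -> bool:
    """True when the hop count is odd, so the packet arrives flipped."""
    return hop_count(src, dst) % 2 == 1


packet = ["head", "body1", "body2", "tail"]
print(send_through_lifo_hops(packet, 1))  # flipped after an odd hop count
print(send_through_lifo_hops(packet, 2))  # original order after an even hop count
print(needs_correction((0, 0), (1, 2)))   # 3 hops: correction would be needed
```

Under these assumptions, a scheduler that only places communicating tasks on cores an even number of hops apart never has to pay for reordering, which is the intuition behind taking hop parity into account during task scheduling.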
Acknowledgement
This work is supported by the National NSF of China (No. 61802312 and 61472322), the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2019JQ-618), and the open fund of Integrated Aero-Space-Ground-Ocean Big Data Application Technology (No. 20200105).
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Cao, W., Wang, J., Wang, D., Mei, K. (2022). A Hop-Parity-Involved Task Schedule for Lightweight Racetrack-Buffer in Energy-Efficient NoCs. In: Qiu, M., Gai, K., Qiu, H. (eds) Smart Computing and Communication. SmartCom 2021. Lecture Notes in Computer Science, vol 13202. Springer, Cham. https://doi.org/10.1007/978-3-030-97774-0_25
DOI: https://doi.org/10.1007/978-3-030-97774-0_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-97773-3
Online ISBN: 978-3-030-97774-0
eBook Packages: Computer Science, Computer Science (R0)