DOI: 10.1145/3394885.3431636
research-article

A Physical-Aware Framework for Memory Network Design Space Exploration

Published: 29 January 2021

ABSTRACT

In the era of big data, demand for server memory capacity and performance keeps growing. A memory network is a promising alternative that provides high bandwidth and low latency through distributed memory nodes connected by high-speed interconnects. However, most existing memory network designs work at a purely logical level and ignore the physical impact of network interconnect latency, processor placement, and the interplay between processor and memory. In this work, we propose a physical-aware framework for memory network design space exploration, which facilitates the design of an energy-efficient, physical-aware memory network system. Experimental results on various workloads show that the proposed framework can help customize the network topology, with significant improvements on various design metrics compared to other commonly used topologies.

Published in

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference
January 2021
930 pages
ISBN: 9781450379991
DOI: 10.1145/3394885

Copyright © 2021 ACM

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery
New York, NY, United States

Qualifiers

• research-article
• Research
• Refereed limited

Acceptance Rates

ASPDAC '21 Paper Acceptance Rate: 111 of 368 submissions, 30%
Overall Acceptance Rate: 466 of 1,454 submissions, 32%
