A Physical-Aware Framework for Memory Network Design Space Exploration

Authors:
Tianhao Shen, Zhejiang University, Hangzhou, China
Di Gao, Zhejiang University, Hangzhou, China
Li Zhang, Zhejiang University, Hangzhou, China
Jishen Zhao, University of California, San Diego, CA, USA
Cheng Zhuo, Zhejiang University, Hangzhou, China

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference, January 2021, Pages 865–871, https://doi.org/10.1145/3394885.3431636

Published: 29 January 2021

ABSTRACT

In the era of big data, demand for server memory capacity and performance keeps growing. A memory network is a promising alternative that provides high bandwidth and low latency through distributed memory nodes connected by high-speed interconnect. However, most existing memory network designs work purely at the logic level and ignore physical effects: network interconnect latency, processor placement, and the interplay between processor and memory. In this work, we propose a physical-aware framework for memory network design space exploration, which facilitates the design of an energy-efficient, physical-aware memory network system. Experimental results on various workloads show that the proposed framework can help customize the network topology, with significant improvements on various design metrics compared with other commonly used topologies.

Index Terms

A Physical-Aware Framework for Memory Network Design Space Exploration
1. Computer systems organization
  1. Dependable and fault-tolerant systems and networks
    1. Redundancy
  2. Embedded and cyber-physical systems
    1. Embedded systems
2. Networks
  1. Network properties
    1. Network reliability

Published in

ASPDAC '21: Proceedings of the 26th Asia and South Pacific Design Automation Conference
January 2021
930 pages
ISBN:9781450379991
DOI:10.1145/3394885

Copyright © 2021 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Qualifiers
- research-article
- Research
- Refereed limited
Acceptance Rates
ASPDAC '21 Paper Acceptance Rate: 111 of 368 submissions, 30%
Overall Acceptance Rate: 466 of 1,454 submissions, 32%