research-article

Optimal memory controller placement for chip multiprocessor

Authors:
Thomas Canhao Xu, Turku Center for Computer Science (TUCS), Turku, Finland
Pasi Liljeberg, University of Turku, Turku, Finland
Hannu Tenhunen, University of Turku, Turku, Finland

CODES+ISSS '11: Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis, October 2011, pages 217–226. https://doi.org/10.1145/2039370.2039405

Published: 09 October 2011

ABSTRACT

In this paper, we analyze and compare different placements of memory controllers for Chip Multiprocessors (CMPs). As the number of cores increases, Network-on-Chip (NoC) based architectures have been proposed as a promising interconnect technique for CMPs. The memory bandwidth between on-chip components and off-chip memory has become a critical bottleneck, and integrating more memory controllers on chip is one feasible way to relieve it. However, the physical location of the memory controllers in a mesh-based NoC has a significant impact on system performance. We investigate the placement of multiple memory controllers in an 8x8 NoC and analyze several metrics. An optimal memory controller placement is found and evaluated. We also propose a generic "divide and conquer" method for solving the placement of memory controllers in large NoCs. Using applications selected from SPLASH-2, PARSEC, TPC and SPEC as benchmarks, we show that the average network latency, average link utilization and performance power product of our optimal placement are reduced by 7.63%, 10.44% and 13.94%, respectively, compared with the conventional two-sides placement. This paper gives a solid theoretical foundation for future CMP design.
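To make the placement question concrete, the sketch below compares two controller layouts on an 8x8 mesh by a single proxy metric: the average hop count from each tile to its nearest memory controller, assuming minimal (Manhattan-distance) routing. The abstract does not say which metrics the paper analyzes, so this metric, the function name `avg_hops_to_nearest`, the choice of 16 controllers, and the diagonal layout are all illustrative assumptions; the diagonal layout is not claimed to be the paper's optimal placement.

```python
from itertools import product

def avg_hops_to_nearest(controllers, n=8):
    """Mean Manhattan distance from every tile of an n x n mesh
    to its nearest memory controller (hypothetical proxy metric)."""
    total = 0
    for x, y in product(range(n), repeat=2):
        total += min(abs(x - cx) + abs(y - cy) for cx, cy in controllers)
    return total / (n * n)

# Assumed layouts, 16 controllers each (not taken from the paper).
# "Two sides": one controller per tile on the top and bottom rows.
two_sides = [(x, 0) for x in range(8)] + [(x, 7) for x in range(8)]
# "Diagonal": controllers along both diagonals of the mesh.
diagonal = [(i, i) for i in range(8)] + [(i, 7 - i) for i in range(8)]

print(avg_hops_to_nearest(two_sides))  # 1.5
print(avg_hops_to_nearest(diagonal))   # 1.25
```

Hop count alone ignores contention, link utilization and power, which the abstract reports as separate results, so this is only a first-order way to rank candidate placements.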

Index Terms

Optimal memory controller placement for chip multiprocessor
1. Hardware
  1. Hardware test
    1. Test-pattern generation and fault simulation
  2. Integrated circuits

Published in
CODES+ISSS '11: Proceedings of the seventh IEEE/ACM/IFIP international conference on Hardware/software codesign and system synthesis
October 2011
402 pages
ISBN:9781450307154
DOI:10.1145/2039370
Program Chairs:
Robert P. Dick, University of Michigan
Jan Madsen, Technical University of Denmark
Copyright © 2011 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
divide and conquer
memory controller
multicore
resource placement
Qualifiers
- research-article
Acceptance Rates
Overall Acceptance Rate: 280 of 864 submissions, 32%