Evaluation of Low-Overhead Organizations for the Directory in Future Many-Core CMPs

Ros, Alberto; Acacio, Manuel E.

doi:10.1007/978-3-642-21878-1_12

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6586)

Abstract

If current trends continue, today’s small-scale general-purpose CMPs will soon be replaced by multi-core architectures integrating tens or even hundreds of cores on chip. Some of these many-core CMPs will most likely implement a memory model based on hardware-managed, implicitly addressed, coherent caches. Cache coherence in these designs will probably be maintained by a directory-based protocol implemented in hardware. The organization of the directory structure will be a key design decision because of the area it requires. In this work, we study how the use of compressed sharing codes for the directory affects performance, network traffic, and area in many-core CMPs. In particular, we select two compressed sharing codes with very small area requirements that were previously proposed for large-scale shared-memory multiprocessors. Simulation results for 32-core CMPs show degradations of up to 32% in performance and increases of up to 350% in network traffic. Additionally, since proposals for efficient multicast support in on-chip networks have recently appeared, we also evaluate this support in combination with the compressed sharing codes. Unfortunately, we find that multicast support is not enough to remove all the performance degradation introduced by the compressed sharing codes and can barely reduce network traffic.
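The trade-off between directory area and network traffic can be illustrated with a minimal sketch. It does not reproduce the two sharing codes evaluated in the paper, which the abstract does not name. Instead it assumes a generic coarse-vector code (one bit per group of cores) as a stand-in, and all function names and parameters are hypothetical. The sketch shows that a compressed code needs fewer bits per directory entry but decodes to a superset of the true sharers, so more coherence messages must be sent.

```python
# Illustrative only: a coarse-vector code is ASSUMED here as an example of a
# compressed sharing code; it is not necessarily one of the codes in the paper.

def full_map_bits(num_cores: int) -> int:
    """Bits per directory entry for a full-map code (one bit per core)."""
    return num_cores


def coarse_vector_bits(num_cores: int, group_size: int) -> int:
    """Bits per entry when a single bit covers `group_size` cores."""
    return -(-num_cores // group_size)  # ceiling division


def encode_coarse(sharers: set[int], group_size: int) -> set[int]:
    """Compress an exact sharer set into the set of groups holding a sharer."""
    return {core // group_size for core in sharers}


def decode_coarse(groups: set[int], num_cores: int, group_size: int) -> set[int]:
    """Expand groups back into cores; the result is a superset of the sharers."""
    return {
        core
        for group in groups
        for core in range(group * group_size,
                          min((group + 1) * group_size, num_cores))
    }


if __name__ == "__main__":
    num_cores, group_size = 32, 4      # hypothetical configuration
    true_sharers = {3, 17}             # cores that actually hold the block
    targets = decode_coarse(
        encode_coarse(true_sharers, group_size), num_cores, group_size
    )
    print("bits per entry:", full_map_bits(num_cores),
          "vs", coarse_vector_bits(num_cores, group_size))
    print("invalidations needed:", len(true_sharers),
          "vs sent:", len(targets))
```

In this hypothetical configuration the entry shrinks from 32 bits to 8, while an invalidation that should reach 2 caches is sent to 8. Extra messages of this kind are the sort of traffic overhead the abstract reports for compressed codes.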

We would like to thank the anonymous reviewers for their suggestions. This research was supported by the Spanish MEC and MICINN, as well as European Commission FEDER funds, under Grants CSD2006-00046 and TIN2009-14475-C04, and by PROMETEO from Generalitat Valenciana (GVA) under Grant PROMETEO/2008/060.

Author information

Authors and Affiliations

Dpto. de Informática de Sistemas y Computadores, Universidad Politécnica de Valencia, 46022, Valencia, Spain
Alberto Ros
Dpto. de Ingeniería y Tecnología de Computadores, Universidad de Murcia, 30100, Murcia, Spain
Manuel E. Acacio

Editor information

Editors and Affiliations

CNR, ICAR, Via P. Castellino, 111, 80131, Napoli, Italy
Mario R. Guarracino
INRIA, PIP ENS Lyon, 46 Allée d’Italie, 69364, Lyon, France
Frédéric Vivien
Scientific Computing, University of Vienna, Nordbergstr. 15/3C, 1090, Vienna, Austria
Jesper Larsson Träff
University of Catanzaro, 88100, Catanzaro, Italy
Mario Cannataro
Dept. of Computer Science, University of Pisa, Via Tevere 17, 56122, Pisa, Italy
Marco Danelutto
Gavle Creative Media Lab, Kungsbacksvagen 47, 80632, Gavle, Sweden
Anders Hast
Dept. Math & Stat, University of Naples Parthenope, via Medina 40, 80133, Napoli, Italy
Francesca Perla
TU Dresden, Zellescher Weg 12-14, 01187, Dresden, Germany
Andreas Knüpfer
Dipartimento di Ingegneria dell’ Informazione, Seconda Università di Napoli, via Roma 29, 81031, Aversa, Italy
Beniamino Di Martino
Scaledinfra technologies GmbH, Köllnerhofgasse 3/15A, 1010, Vienna, Austria
Michael Alexander

About this paper

Cite this paper

Ros, A., Acacio, M.E. (2011). Evaluation of Low-Overhead Organizations for the Directory in Future Many-Core CMPs. In: Guarracino, M.R., et al. Euro-Par 2010 Parallel Processing Workshops. Euro-Par 2010. Lecture Notes in Computer Science, vol 6586. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21878-1_12

DOI: https://doi.org/10.1007/978-3-642-21878-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-21877-4
Online ISBN: 978-3-642-21878-1
eBook Packages: Computer Science, Computer Science (R0)
