Optimization of a Linked Cache Coherence Protocol for Scalable Manycore Coherence

Fernández-Pascual, Ricardo; Ros, Alberto; Acacio, Manuel E.

doi:10.1007/978-3-319-30695-7_8

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 9637)

Abstract

Despite having been quite popular during the 1990s because of their important advantages, linked cache coherence protocols have gone completely unnoticed in the multicore wave. In this work we bring them into the spotlight, demonstrating that they are a good alternative to other solutions being proposed nowadays. In particular, we consider the case for a simply-linked list-based cache coherence protocol and propose two techniques, namely Concurrent Replacements (CR) and Opportunistic Replacements (OR), aimed at palliating the negative effects of replacements of clean data. Through detailed simulations of several SPLASH-2 and PARSEC applications, we demonstrate that, armed with CR and OR, simply-linked list-based protocols can match the performance of a non-scalable bit-vector directory while preserving scalability to larger core counts.
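
To make the scalability contrast concrete, the following is a minimal Python sketch, not the protocol evaluated in the paper. It assumes a home node that stores only a head pointer per block and sharers that each store one next pointer, with new sharers prepended at the head. The names `SharerList`, `bit_vector_bits`, and `pointer_bits` are hypothetical, and the pointer width assumes one extra encoding is reserved for a null pointer.

```python
from typing import Dict, List, Optional


class SharerList:
    """Toy model of one block's sharers kept as a singly linked list.

    Assumption: the home node stores only `head`; each sharer's cache
    stores a single next pointer for the block (modeled by `next_ptr`).
    """

    def __init__(self) -> None:
        self.head: Optional[int] = None
        self.next_ptr: Dict[int, Optional[int]] = {}

    def add_sharer(self, core: int) -> None:
        """Prepend a new sharer: it points at the old head and becomes the head."""
        if core in self.next_ptr:
            return  # already a sharer, nothing to do
        self.next_ptr[core] = self.head
        self.head = core

    def invalidate_all(self) -> List[int]:
        """Walk the list from the head, removing every sharer in visit order."""
        visited: List[int] = []
        core = self.head
        while core is not None:
            visited.append(core)
            core = self.next_ptr.pop(core)  # follow and clear the pointer
        self.head = None
        return visited


def bit_vector_bits(num_cores: int) -> int:
    """Sharer bits per directory entry for a full bit-vector: one per core."""
    return num_cores


def pointer_bits(num_cores: int) -> int:
    """Bits for one pointer naming any of `num_cores` cores or null (assumed)."""
    return num_cores.bit_length()
```

Under these assumptions, 64 cores need 64 presence bits per bit-vector entry against a 7-bit pointer, and the gap widens as the core count grows. The trade-off is that an invalidation has to walk the list one sharer at a time.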

This work has been supported by the Spanish MINECO, as well as European Commission FEDER funds, under grants “TIN2012-38341-C04-03” and “TIN2015-66972-C5-3-R”, and by the Fundación Séneca-Agencia de Ciencia y Tecnología de la Región de Murcia under grant “19295/PI/14”.

Notes

1. Without loss of generality, we assume a private L1 cache in each core and an inclusive, shared L2 cache distributed among them.
2. In the case of clean shared replacements, the writeback buffer only needs to store the sharing information, not the data. Because this information is very small in List, it may alternatively be kept in a miss status holding register (MSHR) or a similar structure.
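
The second note can be illustrated with a small Python sketch. It is an illustration under stated assumptions, not the mechanism described in the paper: sharers form a singly linked list reachable from a head pointer, and an evicted clean shared line is unlinked by walking from the head to its predecessor. The `PendingReplacement` type and the `evict_clean_shared` function are hypothetical names. The point is only that the buffered entry carries a pointer and no data.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class PendingReplacement:
    """Hypothetical buffer entry for a clean shared eviction.

    It holds only sharing information (one pointer); the block's data
    is clean, so no copy of it is kept here.
    """
    block: int
    next_sharer: Optional[int]


def evict_clean_shared(
    head: Optional[int],
    next_ptr: Dict[int, Optional[int]],
    core: int,
    block: int,
) -> Tuple[Optional[int], PendingReplacement]:
    """Unlink `core` from the sharer list; return the new head and the entry.

    Raises KeyError if `core` is not currently a sharer.
    """
    # Park the evicted line's next pointer; this is all that must survive.
    entry = PendingReplacement(block, next_ptr.pop(core))
    if head == core:
        # The evicted sharer was the head: its successor becomes the head.
        return entry.next_sharer, entry
    # Otherwise walk from the head to the predecessor and bypass `core`.
    prev = head
    while next_ptr[prev] != core:
        prev = next_ptr[prev]
    next_ptr[prev] = entry.next_sharer
    return head, entry
```

For example, with the list 1 → 7 → 3, evicting core 7 leaves 1 → 3 and produces an entry whose only payload is the pointer to core 3.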

Author information

Authors and Affiliations

Dpto. de Ingeniería y Tecnología de Computadores, Universidad de Murcia, Murcia, Spain
Ricardo Fernández-Pascual, Alberto Ros & Manuel E. Acacio

Corresponding author

Correspondence to Ricardo Fernández-Pascual.

Editor information

Editors and Affiliations

Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
Frank Hannig
Faculty of Engineering (FEUP), University of Porto, Porto, Portugal
João M. P. Cardoso
Universität zu Lübeck, Lübeck, Germany
Thilo Pionteck
Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
Dietmar Fey
Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
Wolfgang Schröder-Preikschat
Friedrich-Alexander University Erlangen-Nürnberg, Erlangen, Germany
Jürgen Teich

About this paper

Cite this paper

Fernández-Pascual, R., Ros, A., Acacio, M.E. (2016). Optimization of a Linked Cache Coherence Protocol for Scalable Manycore Coherence. In: Hannig, F., Cardoso, J.M.P., Pionteck, T., Fey, D., Schröder-Preikschat, W., Teich, J. (eds) Architecture of Computing Systems – ARCS 2016. ARCS 2016. Lecture Notes in Computer Science, vol 9637. Springer, Cham. https://doi.org/10.1007/978-3-319-30695-7_8

DOI: https://doi.org/10.1007/978-3-319-30695-7_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30694-0
Online ISBN: 978-3-319-30695-7
eBook Packages: Computer Science, Computer Science (R0)
