PS-Cache: an energy-efficient cache design for chip multiprocessors

Valls, Joan J.; Ros, Alberto; Sahuquillo, Julio; Gomez, Maria E.

doi:10.1007/s11227-014-1288-5

Published: 13 September 2014

Volume 71, pages 67–86, (2015)

The Journal of Supercomputing

Abstract

Power consumption has become a major design concern in current high-performance chip multiprocessors, and the problem worsens as the core count grows. On-chip caches often consume a significant fraction of the total power budget, so considerable research has focused on reducing energy consumption in these structures. To enhance performance, on-chip caches are being deployed with high associativity. Consequently, concurrently accessing all the ways in a cache set is costly in terms of energy. This paper presents the PS-Cache architecture, an energy-efficient cache design that reduces the number of accessed ways without hurting performance. The PS-Cache exploits knowledge of whether the referenced block is private or shared, saving energy by accessing only those ways that hold the kind of block being looked up. Experimental results show that, on average, the PS-Cache architecture can reduce the dynamic energy consumption of L1 and L2 caches by 22% and 40%, respectively.
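The way-filtering idea in the abstract can be illustrated with a small software model. The sketch below is not the authors' design: the class name `PSCacheSet`, the static split of a set into private and shared ways, the `is_private` flag supplied by the caller, and the trivial replacement choice are all assumptions made for illustration. It only shows how probing the ways of one kind reduces the number of ways accessed per lookup.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class PSCacheSet:
    """Toy model of one cache set whose ways are split by block kind.

    Assumption (not from the paper): the split is static, with the first
    `private_ways` ways holding private blocks and the rest shared blocks.
    """
    private_ways: int
    shared_ways: int
    tags: List[Optional[int]] = field(init=False)

    def __post_init__(self) -> None:
        # Every way starts empty (no tag stored).
        self.tags = [None] * (self.private_ways + self.shared_ways)

    def _candidate_ways(self, is_private: bool) -> range:
        # Only the ways reserved for this kind of block are candidates.
        if is_private:
            return range(0, self.private_ways)
        return range(self.private_ways, self.private_ways + self.shared_ways)

    def lookup(self, tag: int, is_private: bool) -> Tuple[bool, int]:
        """Return (hit, ways_probed). All candidate ways count as probed,
        mimicking a parallel tag comparison over that group only."""
        ways = self._candidate_ways(is_private)
        hit = any(self.tags[w] == tag for w in ways)
        return hit, len(ways)

    def fill(self, tag: int, is_private: bool, victim_offset: int = 0) -> None:
        """Place a block in its group; the victim choice is a hypothetical
        placeholder, not a replacement policy from the paper."""
        ways = self._candidate_ways(is_private)
        self.tags[ways[victim_offset % len(ways)]] = tag


# Usage: a 4-way set with 3 private ways and 1 shared way (arbitrary split).
s = PSCacheSet(private_ways=3, shared_ways=1)
s.fill(0x1A, is_private=True)
print(s.lookup(0x1A, is_private=True))   # (True, 3): 3 of 4 ways probed
print(s.lookup(0x1A, is_private=False))  # (False, 1): only the shared way probed
```

In this model a conventional lookup would probe all four ways, so the count returned by `lookup` is a rough proxy for the dynamic-energy saving the abstract describes; actual savings depend on the hardware details reported in the paper.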

Notes

Experimental environment, system parameters, protocols and cache hierarchy are described in Section 5.
If a TLB miss occurs, the corresponding entry is in the TLB once the miss has been resolved.

Acknowledgments

This work has been jointly supported by the MINECO and European Commission (FEDER funds) under the project TIN2012-38341-C04-01 and the Fundación Seneca-Agencia de Ciencia y Tecnología de la Región de Murcia under the project Jóvenes Líderes en Investigación 18956/JLI/13.

Author information

Authors and Affiliations

Department of Computing Engineering, Universitat Politècnica de València, Valencia, Spain
Joan J. Valls, Julio Sahuquillo & Maria E. Gomez
Department of Computer Engineering, Universidad de Murcia, Murcia, Spain
Alberto Ros

Corresponding author

Correspondence to Joan J. Valls.

About this article

Cite this article

Valls, J.J., Ros, A., Sahuquillo, J. et al. PS-Cache: an energy-efficient cache design for chip multiprocessors. J Supercomput 71, 67–86 (2015). https://doi.org/10.1007/s11227-014-1288-5

Published: 13 September 2014
Issue Date: January 2015
DOI: https://doi.org/10.1007/s11227-014-1288-5
