Lightweight dynamic partitioning for last-level cache of multicore processor on real system

Zhang, Ludan; Liu, Yi; Wang, Rui; Qian, Depei

doi:10.1007/s11227-014-1092-2

Published: 20 January 2014

Volume 69, pages 547–560, (2014)

The Journal of Supercomputing

Abstract

With the rapid development of multi-core and many-core processors, contention in the shared cache has become increasingly serious and restricts the performance improvement of parallel programs. Recent research has employed page coloring to realize cache partitioning on real systems and to reduce contention in the shared cache. However, page coloring-based cache partitioning has side effects. First, page coloring restricts the memory space that an application can allocate, which may lead to memory pressure. Second, changing the cache partition dynamically requires massive page copying, which incurs large overhead. To make page coloring-based cache partitioning more practical, this paper proposes a malloc allocator-based dynamic cache partitioning mechanism with page coloring. Memory allocated by our malloc allocator can be dynamically partitioned among different applications according to a partitioning policy. Compared with coloring all pages, coloring only the dynamically allocated pages relieves memory pressure and reduces the page copying overhead caused by re-coloring. To further reduce the overhead, we introduce a minimum distance page copying strategy and a lazy flush strategy. We conduct experiments on a real system to evaluate these strategies, and the results show that they work well for reducing cache misses and re-coloring overhead.
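
For readers unfamiliar with page coloring, the sketch below illustrates the general idea the abstract relies on: in a physically indexed, set-associative cache, the low bits of the physical page frame number select a group of cache sets (a "color"), so giving applications disjoint color ranges gives them disjoint parts of the cache. This is not the paper's allocator. The cache geometry, the function names `num_colors`, `page_color`, and `partition_colors`, and the proportional split are all assumptions made for illustration, and the sketch ignores address hashing that some processors apply to the last-level cache.

```python
# Illustrative page-coloring arithmetic. All constants are hypothetical
# and are NOT taken from the paper.
PAGE_SIZE = 4096                 # bytes per page (assumed)
CACHE_SIZE = 8 * 1024 * 1024     # 8 MiB shared last-level cache (assumed)
ASSOCIATIVITY = 16               # ways per set (assumed)


def num_colors(cache_size=CACHE_SIZE, associativity=ASSOCIATIVITY,
               page_size=PAGE_SIZE):
    """Number of page colors: the bytes covered by one cache way,
    divided by the page size."""
    return cache_size // (associativity * page_size)


def page_color(phys_addr, colors=None, page_size=PAGE_SIZE):
    """Color of the physical page containing phys_addr: the page frame
    number modulo the number of colors."""
    if colors is None:
        colors = num_colors()
    return (phys_addr // page_size) % colors


def partition_colors(shares, colors):
    """Split the colors 0..colors-1 into contiguous ranges, one per
    application, in proportion to integer shares (hypothetical policy)."""
    total = sum(shares.values())
    items = list(shares.items())
    result, start = {}, 0
    for i, (app, share) in enumerate(items):
        # The last application takes the remainder so every color is assigned.
        if i == len(items) - 1:
            count = colors - start
        else:
            count = colors * share // total
        result[app] = range(start, start + count)
        start += count
    return result


if __name__ == "__main__":
    colors = num_colors()
    print("colors:", colors)
    print("color of 0x12345000:", page_color(0x12345000))
    print(partition_colors({"A": 3, "B": 1}, colors))
```

Under these assumed parameters, changing an application's share changes the color range it owns, and pages whose color falls outside the new range would have to be copied to frames of an allowed color. That copying is the re-coloring overhead the abstract aims to reduce.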

Acknowledgments

We thank the anonymous reviewers for their insightful comments, which greatly improved the quality of this manuscript. This work is supported by the National Science Foundation of China under Grant Nos. 61073011 and 61133004, and by the National High-Tech Program of China (863 Program) under Grant No. 2012AA01A302.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Beihang University, Beijing, China
Ludan Zhang, Yi Liu, Rui Wang & Depei Qian

Corresponding author

Correspondence to Yi Liu.

About this article

Cite this article

Zhang, L., Liu, Y., Wang, R. et al. Lightweight dynamic partitioning for last-level cache of multicore processor on real system. J Supercomput 69, 547–560 (2014). https://doi.org/10.1007/s11227-014-1092-2

Published: 20 January 2014
Issue Date: August 2014
DOI: https://doi.org/10.1007/s11227-014-1092-2
