On modeling contention for shared caches in multi-core processors with techniques from ecology

Emeneker, Wesley; Apon, Amy

doi:10.1007/s11047-012-9348-3

Published: 20 October 2012

Volume 12, pages 411–428, (2013)

Natural Computing


Abstract

Multi-core x86_64 processors introduced an important change in architecture, a shared last level cache. Historically, each processor has had access to a large private cache that seamlessly and transparently (to end users) interfaced with main memory. Previously, processes or threads only had to compete for memory bandwidth, but now they also compete for cache space. Competition for space and environmental resources is a problem studied in other scientific domains. This paper introduces methods from ecology to model multi-core cache usage with the competitive Lotka–Volterra equations. A model is presented and validated for characterizing the interaction of cores through shared caching, and for predicting the degree to which different workloads will interfere with each other's execution through cache contention.
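For readers unfamiliar with the competitive Lotka–Volterra equations named above, the following is a minimal sketch of their textbook form, dx_i/dt = r_i x_i (1 - Σ_j α_ij x_j / K_i), integrated with a hand-written Runge–Kutta step. It is not the authors' model: the preview does not show their formulation or fitted parameters, so the cache interpretation in the comments, the function names (`lv_rates`, `rk4_step`, `integrate`), and every numeric value are assumptions chosen only for illustration.

```python
"""Sketch: textbook competitive Lotka-Volterra dynamics for n competitors.

    dx_i/dt = r_i * x_i * (1 - sum_j(alpha[i][j] * x_j) / K_i)

Hypothetical reading (an assumption, not taken from the paper): x_i is the
share of the shared cache held by the workload on core i, K_i is the share it
would settle at when running alone, and alpha[i][j] weights how strongly
workload j crowds out workload i.
"""


def lv_rates(x, r, K, alpha):
    """Return dx/dt for every competitor at state x."""
    n = len(x)
    return [
        r[i] * x[i] * (1.0 - sum(alpha[i][j] * x[j] for j in range(n)) / K[i])
        for i in range(n)
    ]


def rk4_step(x, dt, r, K, alpha):
    """Advance the state by one classical fourth-order Runge-Kutta step."""

    def shifted(k, scale):
        return [xi + scale * ki for xi, ki in zip(x, k)]

    k1 = lv_rates(x, r, K, alpha)
    k2 = lv_rates(shifted(k1, 0.5 * dt), r, K, alpha)
    k3 = lv_rates(shifted(k2, 0.5 * dt), r, K, alpha)
    k4 = lv_rates(shifted(k3, dt), r, K, alpha)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def integrate(x0, r, K, alpha, dt=0.01, steps=20000):
    """Integrate from x0 for steps * dt time units and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = rk4_step(x, dt, r, K, alpha)
    return x


# Illustrative values only; none of these come from the paper.
R = [1.0, 0.8]                    # growth rates
K = [1.0, 1.0]                    # carrying capacities (normalized cache share)
ALPHA = [[1.0, 0.5], [0.7, 1.0]]  # competition weights, alpha[i][i] = 1

if __name__ == "__main__":
    print(integrate([0.05, 0.05], R, K, ALPHA))
```

With these illustrative values both cross-competition weights are below 1 and the capacities are equal, so the two competitors coexist. The fixed point solves x_1 + 0.5 x_2 = 1 and 0.7 x_1 + x_2 = 1, giving x_1 = 0.5/0.65 and x_2 = 0.3/0.65, and the integration settles there. Raising a cross weight above 1 would instead let one competitor exclude the other.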

References

Agarwal A (1992) Performance tradeoffs in multithreaded processors. IEEE Trans Parallel Distrib Syst 3(5):525–539
Agarwal A, Hennessy J, Horowitz M (1989) An analytical cache model. ACM Trans Comput Syst 7:184–215
Aho AV, Denning PJ, Ullman JD (1971) Principles of optimal page replacement. J ACM 18:80–93
Antoniou S, Lambropoulou S (2008) Dynamical systems and topological surgery. ArXiv e-prints
Berryman AA (1992) The origins and evolution of predator–prey theory. Ecol Freshw Fish 73:1520–1535
Boyd-Wickizer S, Morris R, Kaashoek MF (2009) Reinventing scheduling for multicore systems. In: Proceedings of the 12th conference on Hot topics in operating systems, HotOS’09. USENIX Association, Berkeley, CA, p 21
Capitán JA, Cuesta JA (2010) Species assembly in model ecosystems, I: analysis of the population model and the invasion dynamics. J Theor Biol 269(1):330–343
Chandra D, Guo F, Kim S, Solihin Y (2005) Predicting inter-thread cache contention on a chip multi-processor architecture. In: Proceedings of the 11th international symposium on high-performance computer architecture. IEEE Computer Society, Washington, pp 340–351
Emeneker W, Apon A (2010) Cache effects of virtual machine placement on multi-core processors. International conference on computer and information technology, pp 2261–2266
Emeneker W, Apon A (2012) Characterising the performance of cache-aware placement of virtual machines on a multi-core architecture. Int J Ad Hoc Ubiquitous Comput 10(2):84–95
Fedorova A, Seltzer M, Smith MD (2007) Improving performance isolation on chip multiprocessors via an operating system scheduler. In: Proceedings of the 16th international conference on parallel architecture and compilation techniques, PACT ’07. IEEE Computer Society, Washington, pp 25–38
Harper JS, Kerbyson DJ, Nudd GR (1999) Analytical modeling of set-associative cache behavior. IEEE Trans Comput 48:1009–1024
Hou Z (2007) Global attractor in competitive Lotka–Volterra systems with retardation. ArXiv e-prints
Hunter JD (2007) Matplotlib: a 2D graphics environment. Comput Sci Eng 9(3):90–95
Jiang Y, Tian K, Shen X (2010) Combining locality analysis with online proactive job co-scheduling in chip multiprocessors. In: Patt Y, Foglia P, Duesterwald E, Faraboschi P, Martorell X (eds) High performance embedded architectures and compilers, vol 5952 of lecture notes in computer science. Springer, Berlin, pp 201–215
Jones E, Oliphant T, Peterson P et al (2001) SciPy: open source scientific tools for Python (online)
Jost C, Devulder G, Peterson RO, Arditi R (2005) The wolves of Isle Royale display scale-invariant satiation and ratio-dependent predation on moose. J Anim Ecol 74(5):809–816
Kaplan SF, McGeoch LA, Cole MF (2002) Adaptive caching for demand prepaging. SIGPLAN Not 38:114–126
Kaseridis D, Stuecheli J, John LK (2009) Bank-aware dynamic cache partitioning for multicore architectures. In: International conference on parallel processing, pp 18–25
Kessler RE, Hill MD (1992) Page placement algorithms for large real-indexed caches. ACM Trans Comput Syst 10:338–359
Levon J, Elie P (2008) Oprofile: a system-wide Profiler for Linux Systems. http://oprofile.sourceforge.net
Lin J, Lu Q, Ding X, Zhang Z, Zhang X, Sadayappan P (2008) Gaining insights into multicore cache partitioning: bridging the gap between simulation and real systems. In: IEEE 14th international symposium on high performance computer architecture, 2008. HPCA 2008, pp 367–378
Malcai O, Biham O, Richmond P, Solomon S (2002) Theoretical analysis and simulations of the generalized Lotka–Volterra model. Phys Rev E 66(3):031102/1–031102/4
Nethercote N, Seward J (2007) Valgrind: a framework for heavyweight dynamic binary instrumentation. SIGPLAN Not 42:89–100
Oden PH, Shedler GS (1972) A model of memory contention in a paging machine. Commun ACM 15:761–771
Odum E (1971) Fundamentals of ecology, 3rd edn. W. B. Saunders Co., Philadelphia
Oliver NA (1974) Experimental data on page replacement algorithm. In: Proceedings of the national computer conference and exposition, AFIPS ’74, ACM, New York, pp 179–184
Petoumenos P, Keramidas G, Zeffer H, Kaxiras S, Hagersten E (2006) Modeling cache sharing on chip multiprocessor architectures. In: IEEE International Symposium on workload characterization, 2006, pp 160–171
Qureshi MK, Patt YN (2006) Utility-based cache partitioning: a low-overhead, high-performance, runtime mechanism to partition shared caches. In: Proceedings of the 39th annual IEEE/ACM international symposium on microarchitecture, MICRO 39. IEEE Computer Society, Washington, pp 423–432
Saini S, Bailey DH (1996) NAS parallel benchmark (version 1.0) results 11-96, November 1996
Shi X, Su F, Peir J-K, Xia Y, Yang Z (2009) Modeling and stack simulation of CMP cache capacity and accessibility. IEEE Trans Parallel Distrib Syst 20:1752–1763
Smith AJ (1981) Internal scheduling and memory contention. IEEE Trans Softw Eng SE-7(1):135–146
Song F, Moore S, Dongarra J (2007) L2 cache modeling for scientific applications on chip multi-processors. In: International conference on parallel processing, 2007. ICPP 2007, p 51
Suh GE, Devadas S, Rudolph L (2001) Analytical cache models with applications to cache partitioning. In: Proceedings of the 15th international conference on supercomputing, ICS’01. ACM, New York, pp 1–12
Tam D, Azimi R, Stumm M (2007) Thread clustering: sharing-aware scheduling on SMP-CMP-SMT multiprocessors. In: Proceedings of the 2nd ACM SIGOPS/EuroSys European conference on computer systems 2007, EuroSys ’07. ACM, New York, pp 47–58
Tam DK, Azimi R, Soares LB, Stumm M (2009) RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations. SIGPLAN Not 44:121–132
Xue J, Vera X (2004) Efficient and accurate analytical modeling of whole-program data cache behavior. IEEE Trans Comput 53(5):547–566
Zhang X, Dwarkadas S, Shen K (2009) Towards practical page coloring-based multicore cache management. In: Proceedings of the 4th ACM European conference on computer systems, EuroSys ’09. ACM, New York, pp 89–102
Zhang EZ, Jiang Y, Shen X (2010) Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs? In: Proceedings of the 15th ACM SIGPLAN symposium on principles and practice of parallel programming, PPoPP’10. ACM, New York, pp 203–212
Zhuravlev S, Blagodurov S, Fedorova A (2010) Addressing shared resource contention in multicore processors via scheduling. SIGPLAN Not 45:129–142

Acknowledgments

This work was supported in part by NSF grant MRI#0722625. Figures were generated with matplotlib (Hunter 2007): http://matplotlib.sf.net. The authors thank the anonymous reviewers for their helpful and insightful suggestions.

Author information

Authors and Affiliations

Georgia Institute of Technology, Rich Building 328, 258 Fourth St. NW, Atlanta, GA, 30332, USA
Wesley Emeneker
Clemson University, 100 McAdams Hall, Clemson, SC, 29634, USA
Amy Apon

Corresponding author

Correspondence to Wesley Emeneker.

About this article

Cite this article

Emeneker, W., Apon, A. On modeling contention for shared caches in multi-core processors with techniques from ecology. Nat Comput 12, 411–428 (2013). https://doi.org/10.1007/s11047-012-9348-3

Published: 20 October 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s11047-012-9348-3
