
On modeling contention for shared caches in multi-core processors with techniques from ecology

Natural Computing

Abstract

Multi-core x86_64 processors introduced an important architectural change: a shared last-level cache. Historically, each processor had access to a large private cache that interfaced with main memory seamlessly and transparently to end users. Previously, processes or threads competed only for memory bandwidth; now they also compete for space in the cache. Competition for space and environmental resources is a problem studied in other scientific domains. This paper introduces methods from ecology to model multi-core cache usage with the competitive Lotka–Volterra equations. A model is presented and validated for characterizing the interaction of cores through shared caching, and for predicting the degree to which different workloads will interfere with each other's execution through cache contention.
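For readers unfamiliar with the competitive Lotka–Volterra equations, the following is a minimal sketch of their standard textbook form, dx_i/dt = r_i x_i (1 - sum_j alpha_ij x_j / K_i), integrated with forward Euler. It is not the model fitted in the paper. The reading of x_i as the cache share held by workload i, K_i as the capacity available to it, and alpha_ij as the pressure workload j exerts on workload i is an assumption made for illustration, as are the function names and all parameter values.

```python
def competitive_lv_step(x, r, K, alpha, dt):
    """One forward-Euler step of the competitive Lotka-Volterra equations.

    x     -- current occupancy of each competitor (hypothetical: cache share)
    r     -- intrinsic growth rate of each competitor
    K     -- carrying capacity of each competitor
    alpha -- alpha[i][j] is the effect of competitor j on competitor i;
             alpha[i][i] is conventionally 1.0
    """
    n = len(x)
    new_x = []
    for i in range(n):
        # Combined pressure on competitor i from itself and all others.
        pressure = sum(alpha[i][j] * x[j] for j in range(n))
        dx = r[i] * x[i] * (1.0 - pressure / K[i])
        new_x.append(x[i] + dt * dx)
    return new_x


def simulate(x0, r, K, alpha, dt=0.01, steps=20000):
    """Integrate from x0 for `steps` Euler steps and return the final state."""
    x = list(x0)
    for _ in range(steps):
        x = competitive_lv_step(x, r, K, alpha, dt)
    return x


if __name__ == "__main__":
    # Illustrative values only: two workloads sharing a cache of normalized
    # capacity 1.0, each exerting half as much pressure on the other as on itself.
    final = simulate(
        x0=[0.1, 0.3],
        r=[1.0, 1.0],
        K=[1.0, 1.0],
        alpha=[[1.0, 0.5], [0.5, 1.0]],
    )
    print(final)
```

With these symmetric coefficients the coexistence equilibrium can be found by hand: solving x1 + 0.5 x2 = 1 and 0.5 x1 + x2 = 1 gives x1 = x2 = 2/3, so each workload settles at less than the full capacity it would reach alone. A production model would more likely use an adaptive integrator than fixed-step Euler.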




References

  • Agarwal A (1992) Performance tradeoffs in multithreaded processors. IEEE Trans Parallel Distrib Syst 3(5):525–539


  • Agarwal A, Hennessy J, Horowitz M (1989) An analytical cache model. ACM Trans Comput Syst 7:184–215


  • Aho AV, Denning PJ, Ullman JD (1971) Principles of optimal page replacement. J ACM 18:80–93


  • Antoniou S, Lambropoulou S (2008) Dynamical systems and topological surgery. ArXiv e-prints

  • Berryman AA (1992) The origins and evolution of predator–prey theory. Ecol Freshw Fish 73:1520–1535


  • Boyd-Wickizer S, Morris R, Kaashoek MF (2009) Reinventing scheduling for multicore systems. In: Proceedings of the 12th conference on Hot topics in operating systems, HotOS’09. USENIX Association, Berkeley, CA, p 21

  • Capitán JA, Cuesta JA (2010) Species assembly in model ecosystems, I: analysis of the population model and the invasion dynamics. J Theor Biol 269(1):330–343


  • Chandra D, Guo F, Kim S, Solihin Y (2005) Predicting inter-thread cache contention on a chip multi-processor architecture. In: Proceedings of the 11th international symposium on high-performance computer architecture. IEEE Computer Society, Washington, pp 340–351

  • Emeneker W, Apon A (2010) Cache effects of virtual machine placement on multi-core processors. International conference on computer and information technology, pp 2261–2266

  • Emeneker W, Apon A (2012) Characterising the performance of cache-aware placement of virtual machines on a multi-core architecture. Int J Ad Hoc Ubiquitous Comput 10(2):84–95


  • Fedorova A, Seltzer M, Smith MD (2007) Improving performance isolation on chip multiprocessors via an operating system scheduler. In: Proceedings of the 16th international conference on parallel architecture and compilation techniques, PACT ’07. IEEE Computer Society, Washington, pp 25–38

  • Harper JS, Kerbyson DJ, Nudd GR (1999) Analytical modeling of set-associative cache behavior. IEEE Trans Comput 48:1009–1024


  • Hou Z (2007) Global attractor in competitive Lotka–Volterra systems with retardation. ArXiv e-prints

  • Hunter JD (2007) Matplotlib: a 2D graphics environment. Comput Sci Eng 9(3):90–95


  • Jiang Y, Tian K, Shen X (2010) Combining locality analysis with online proactive job co-scheduling in chip multiprocessors. In: Patt Y, Foglia P, Duesterwald E, Faraboschi P, Martorell X (eds) High performance embedded architectures and compilers, vol 5952 of lecture notes in computer science. Springer, Berlin, pp 201–215

  • Jones E, Oliphant T, Peterson P et al (2001) SciPy: open source scientific tools for Python (online)

  • Jost C, Devulder G, Peterson RO, Arditi R (2005) The wolves of Isle Royale display scale-invariant satiation and ratio-dependent predation on moose. J Anim Ecol 74(5):809–816


  • Kaplan SF, McGeoch LA, Cole MF (2002) Adaptive caching for demand prepaging. SIGPLAN Not 38:114–126


  • Kaseridis D, Stuecheli J, John LK (2009) Bank-aware dynamic cache partitioning for multicore architectures. In: International conference on parallel processing, pp 18–25

  • Kessler RE, Hill MD (1992) Page placement algorithms for large real-indexed caches. ACM Trans Comput Syst 10:338–359


  • Levon J, Elie P (2008) OProfile: a system-wide profiler for Linux systems. http://oprofile.sourceforge.net

  • Lin J, Lu Q, Ding X, Zhang Z, Zhang X, Sadayappan P (2008) Gaining insights into multicore cache partitioning: bridging the gap between simulation and real systems. In: IEEE 14th international symposium on high performance computer architecture, 2008. HPCA 2008, pp 367–378

  • Malcai O, Biham O, Richmond P, Solomon S (2002) Theoretical analysis and simulations of the generalized Lotka–Volterra model. Phys Rev E 66(3):031102/1–031102/4


  • Nethercote N, Seward J (2007) Valgrind: a framework for heavyweight dynamic binary instrumentation. SIGPLAN Not 42:89–100


  • Oden PH, Shedler GS (1972) A model of memory contention in a paging machine. Commun ACM 15:761–771


  • Odum E (1971) Fundamentals of ecology, 3rd edn. W. B. Saunders Co., Philadelphia


  • Oliver NA (1974) Experimental data on page replacement algorithm. In: Proceedings of the national computer conference and exposition, AFIPS ’74, ACM, New York, pp 179–184

  • Petoumenos P, Keramidas G, Zeffer H, Kaxiras S, Hagersten E (2006) Modeling cache sharing on chip multiprocessor architectures. In: IEEE International Symposium on workload characterization, 2006, pp 160–171

  • Qureshi MK, Patt YN (2006) Utility-based cache partitioning: a low-overhead, high-performance, runtime mechanism to partition shared caches. In: Proceedings of the 39th annual IEEE/ACM international symposium on microarchitecture, MICRO 39. IEEE Computer Society, Washington, pp 423–432

  • Saini S, Bailey DH (1996) NAS parallel benchmark (version 1.0) results 11-96, November 1996

  • Shi X, Su F, Peir J-K, Xia Y, Yang Z (2009) Modeling and stack simulation of CMP cache capacity and accessibility. IEEE Trans Parallel Distrib Syst 20:1752–1763


  • Smith AJ (1981) Internal scheduling and memory contention. IEEE Trans Softw Eng SE-7(1):135–146


  • Song F, Moore S, Dongarra J (2007) L2 cache modeling for scientific applications on chip multi-processors. In: International conference on parallel processing, 2007. ICPP 2007, p 51

  • Suh GE, Devadas S, Rudolph L (2001) Analytical cache models with applications to cache partitioning. In: Proceedings of the 15th international conference on supercomputing, ICS’01. ACM, New York, pp 1–12

  • Tam D, Azimi R, Stumm M (2007) Thread clustering: sharing-aware scheduling on SMP-CMP-SMT multiprocessors. In: Proceedings of the 2nd ACM SIGOPS/EuroSys European conference on computer systems 2007, EuroSys ’07. ACM, New York, pp 47–58

  • Tam DK, Azimi R, Soares LB, Stumm M (2009) RapidMRC: approximating L2 miss rate curves on commodity systems for online optimizations. SIGPLAN Not 44:121–132


  • Xue J, Vera X (2004) Efficient and accurate analytical modeling of whole-program data cache behavior. IEEE Trans Comput 53(5):547–566


  • Zhang X, Dwarkadas S, Shen K (2009) Towards practical page coloring-based multicore cache management. In: Proceedings of the 4th ACM European conference on computer systems, EuroSys ’09. ACM, New York, pp 89–102

  • Zhang EZ, Jiang Y, Shen X (2010) Does cache sharing on modern CMP matter to the performance of contemporary multithreaded programs? In: Proceedings of the 15th ACM SIGPLAN symposium on principles and practice of parallel programming, PPoPP’10. ACM, New York, pp 203–212

  • Zhuravlev S, Blagodurov S, Fedorova A (2010) Addressing shared resource contention in multicore processors via scheduling. SIGPLAN Not 45:129–142



Acknowledgments

This work was supported in part by NSF grant MRI#0722625. Figures were generated with matplotlib (Hunter 2007): http://matplotlib.sf.net. The authors thank the anonymous reviewers for their helpful and insightful suggestions.

Author information

Corresponding author

Correspondence to Wesley Emeneker.

About this article

Cite this article

Emeneker, W., Apon, A. On modeling contention for shared caches in multi-core processors with techniques from ecology. Nat Comput 12, 411–428 (2013). https://doi.org/10.1007/s11047-012-9348-3

  • DOI: https://doi.org/10.1007/s11047-012-9348-3
