Abstract
This paper proposes a dynamic cache repartitioning technique that enhances compositionality on platforms executing media applications with multiple utilization scenarios. Because repartitioning between scenarios requires a cache flush, two undesired effects may occur: (1) in particular, the execution of critical tasks may be disturbed, and (2) in general, a performance penalty is incurred. To cope with these effects we propose a method which: (1) determines, at design time, the cache footprint of each task, such that it creates the premises for the safety of critical tasks and for minimal flushing in general, and (2) enforces, at run time, the cache footprints determined at design time and further decreases the flush penalty. We implement our dynamic cache management strategy on a CAKE multiprocessor with 4 Trimedia cores. The experimental workload consists of 6 multimedia applications, each of which is formed by multiple tasks belonging to an extended MediaBench suite. We found on average that: (1) the relative variation in the execution time of critical tasks is less than 0.1%, regardless of the scenario switching frequency, (2) for realistic scenario switching frequencies the inter-task cache interference is at most 4% for the repartitioned cache, whereas for the shared cache it reaches 68%, and (3) the off-chip memory traffic is reduced by 60%, and the performance (in cycles per instruction) improves by 10%, when compared with the shared cache.
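The interplay between design-time footprints and the flush on a scenario switch can be illustrated with a small sketch. It is not the algorithm of the paper: the function names `layout` and `units_to_flush`, the task names, the footprint sizes, and the notion of a generic cache "allocation unit" are all assumptions made for illustration. The sketch pins critical tasks to fixed offsets in every scenario, so their partitions never move, and counts how many units change owner when the scenario switches.

```python
from typing import Dict, List, Tuple

# (first_unit, size), measured in hypothetical cache allocation units.
Partition = Tuple[int, int]


def layout(scenario_tasks: List[str], footprint: Dict[str, int],
           critical: List[str], total_units: int) -> Dict[str, Partition]:
    """Assign each task of a scenario a contiguous cache partition."""
    placement: Dict[str, Partition] = {}
    offset = 0
    # Critical tasks get the same offset in every scenario: space is reserved
    # for each of them in a fixed order, even if the task is not running.
    for task in critical:
        if task in scenario_tasks:
            placement[task] = (offset, footprint[task])
        offset += footprint[task]
    # Remaining tasks are packed after the reserved critical region.
    for task in scenario_tasks:
        if task not in critical:
            placement[task] = (offset, footprint[task])
            offset += footprint[task]
    if offset > total_units:
        raise ValueError("footprints exceed cache capacity")
    return placement


def owners(placement: Dict[str, Partition]) -> Dict[int, str]:
    """Map every allocated unit to the task that owns it."""
    return {unit: task
            for task, (start, size) in placement.items()
            for unit in range(start, start + size)}


def units_to_flush(old: Dict[str, Partition], new: Dict[str, Partition]) -> int:
    """Count previously owned units whose owner changes across the switch."""
    before, after = owners(old), owners(new)
    return sum(1 for unit, task in before.items() if after.get(unit) != task)


# Illustrative (made-up) footprints and scenarios.
footprint = {"audio": 2, "video": 4, "ui": 1, "net": 3}
scenario_a = layout(["audio", "video", "ui"], footprint, ["audio"], 8)
scenario_b = layout(["audio", "net", "ui"], footprint, ["audio"], 8)
print(scenario_a["audio"], scenario_b["audio"])
print(units_to_flush(scenario_a, scenario_b))
```

In this toy setting the critical task keeps the same partition in both scenarios, so a switch never touches its cache contents; only the units that change owner among the non-critical tasks contribute to the flush cost.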
Cite this article
Molnos, A.M., Cotofana, S.D., Heijligers, M.J.M. et al. Compositional, Dynamic Cache Management for Embedded Chip Multiprocessors. J Sign Process Syst Sign Image Video Technol 57, 155–172 (2009). https://doi.org/10.1007/s11265-008-0276-5
DOI: https://doi.org/10.1007/s11265-008-0276-5