Compositional, Dynamic Cache Management for Embedded Chip Multiprocessors

Journal of Signal Processing Systems

Abstract

This paper proposes a dynamic cache repartitioning technique that enhances compositionality on platforms executing media applications with multiple utilization scenarios. Because repartitioning between scenarios requires a cache flush, two undesired effects may occur: (1) the execution of critical tasks may be disturbed, and (2) more generally, a performance penalty is incurred. To cope with these effects we propose a method that (1) determines, at design time, the cache footprint of each task, so as to create the conditions for the safety of critical tasks and for minimal flushing in general, and (2) enforces, at run time, the cache footprints determined at design time and further decreases the flush penalty. We implement our dynamic cache management strategy on a CAKE multiprocessor with 4 Trimedia cores. The experimental workload consists of 6 multimedia applications, each formed by multiple tasks belonging to an extended MediaBench suite. We found that, on average: (1) the relative variation of the execution time of critical tasks is less than 0.1%, regardless of the scenario switching frequency; (2) for realistic scenario switching frequencies, the inter-task cache interference is at most 4% for the repartitioned cache, whereas for the shared cache it reaches 68%; and (3) compared with the shared cache, the off-chip memory traffic is reduced by 60% and the performance (in cycles per instruction) improves by 10%.
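
To make the idea of a scenario switch concrete, the following is a minimal illustrative sketch and not the authors' algorithm. It assumes a cache partitioned by sets, with each scenario described as a mapping from task name to the cache sets that task owns. Under that assumption, only sets whose owner changes between scenarios would need flushing, and a critical task that keeps the same sets in both scenarios is left undisturbed. The task names, the `Partition` type, and both helper functions are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical model: task name -> indices of the cache sets it owns.
Partition = Dict[str, Set[int]]


def sets_to_flush(old: Partition, new: Partition) -> Set[int]:
    """Return the cache sets whose owning task differs between two scenarios."""
    def owner_map(p: Partition) -> Dict[int, str]:
        return {s: task for task, sets in p.items() for s in sets}

    before, after = owner_map(old), owner_map(new)
    # A set holds stale data only if it had an owner and that owner changes.
    return {s for s, task in before.items() if after.get(s) != task}


def critical_tasks_undisturbed(old: Partition, new: Partition,
                               critical: Iterable[str]) -> bool:
    """Check that every critical task present in both scenarios keeps its sets."""
    return all(old[t] == new[t] for t in critical if t in old and t in new)


# Assumed example scenarios; all task names are made up for illustration.
scenario_a: Partition = {"video_dec": {0, 1, 2, 3}, "audio_dec": {4, 5}, "ui": {6, 7}}
scenario_b: Partition = {"video_dec": {0, 1, 2, 3}, "audio_dec": {4}, "net": {5, 6, 7}}

flush = sets_to_flush(scenario_a, scenario_b)
safe = critical_tasks_undisturbed(scenario_a, scenario_b, ["video_dec"])
```

In this toy case the critical task `video_dec` keeps sets 0 to 3 across the switch, so only the sets handed over to the new task are candidates for flushing. How footprints are actually chosen at design time is not specified in the abstract and is not modelled here.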

Author information

Corresponding author

Correspondence to Anca M. Molnos.

About this article

Cite this article

Molnos, A.M., Cotofana, S.D., Heijligers, M.J.M. et al. Compositional, Dynamic Cache Management for Embedded Chip Multiprocessors. J Sign Process Syst Sign Image Video Technol 57, 155–172 (2009). https://doi.org/10.1007/s11265-008-0276-5

  • DOI: https://doi.org/10.1007/s11265-008-0276-5
