Timing analysis of concurrent programs running on shared cache multi-cores

Real-Time Systems

Abstract

Memory accesses are a major source of timing unpredictability, so timing analysis of real-time embedded software requires bounding memory access times. Multiprocessing, a popular approach to performance enhancement, opens up the opportunity for concurrent execution. However, because the processing cores contend for any shared memory, memory access behavior becomes more unpredictable and hence harder to analyze. In this paper, we develop a timing analysis method for concurrent software running on multi-cores with a shared instruction cache. Tasks communicate by message passing. Our method progressively refines the lifetime estimates of tasks that execute concurrently on multiple cores in order to estimate potential conflicts in the shared cache. Possible conflicts arising from overlapping task lifetimes are accounted for in the hit-miss classification of accesses to the shared cache, which yields safe execution time bounds. On a real-world embedded application, our method produces lower worst-case response time (WCRT) estimates than existing shared-cache analysis. Furthermore, we exploit instruction cache locking to improve the WCRT. Locking beneficial memory blocks into the L1 cache reduces both the worst-case execution time (WCET) of the tasks and the L2 cache conflicts, resulting in a better WCRT. Experiments demonstrate that cache locking achieves a significant WCRT reduction.
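The refinement loop described in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm: the `Task` fields, the `MISS_PENALTY` constant, the one-penalty-per-shared-set cost model, and the helper names (`ancestors`, `wcets`, `lifetimes`, `analyse`) are all assumptions made for illustration, and same-core scheduling is ignored. The sketch starts from the pessimistic assumption that all independent tasks on different cores may interfere, then repeatedly drops pairs whose lifetimes cannot overlap.

```python
from dataclasses import dataclass

MISS_PENALTY = 10  # assumed extra cycles per shared-cache set lost to a conflict


@dataclass
class Task:
    name: str
    core: int
    bcet: int              # best-case execution time (assumed conflict-free)
    base_wcet: int         # worst-case execution time ignoring shared-cache conflicts
    cache_sets: frozenset  # shared-cache sets the task touches (toy model)
    preds: tuple = ()      # names of tasks whose messages must arrive first


def ancestors(tasks):
    """Transitive predecessors of each task; tasks must be in topological order."""
    anc = {}
    for t in tasks:
        anc[t.name] = set(t.preds)
        for p in t.preds:
            anc[t.name] |= anc[p]
    return anc


def wcets(tasks, interferes):
    """Add one miss penalty per cache set shared with any interfering task."""
    out = {}
    for t in tasks:
        hit = set()
        for u in tasks:
            if (t.name, u.name) in interferes:
                hit |= t.cache_sets & u.cache_sets
        out[t.name] = t.base_wcet + MISS_PENALTY * len(hit)
    return out


def lifetimes(tasks, wcet):
    """Return {name: (earliest_start, latest_finish)} along message dependencies."""
    earliest_finish, latest_finish, life = {}, {}, {}
    for t in tasks:
        es = max((earliest_finish[p] for p in t.preds), default=0)
        ls = max((latest_finish[p] for p in t.preds), default=0)
        earliest_finish[t.name] = es + t.bcet
        latest_finish[t.name] = ls + wcet[t.name]
        life[t.name] = (es, latest_finish[t.name])
    return life


def analyse(tasks):
    anc = ancestors(tasks)
    # Pessimistic start: independent tasks on different cores may all interfere.
    interferes = {
        (a.name, b.name) for a in tasks for b in tasks
        if a.core != b.core and a.name not in anc[b.name] and b.name not in anc[a.name]
    }
    while True:
        wcet = wcets(tasks, interferes)
        life = lifetimes(tasks, wcet)
        # Keep only pairs whose lifetime intervals still overlap.
        refined = {
            (a, b) for (a, b) in interferes
            if life[a][0] < life[b][1] and life[b][0] < life[a][1]
        }
        if refined == interferes:  # the set only shrinks, so this terminates
            return wcet, life
        interferes = refined


tasks = [
    Task("t1", core=0, bcet=40, base_wcet=50, cache_sets=frozenset({0, 1})),
    Task("t2", core=1, bcet=15, base_wcet=25, cache_sets=frozenset({1, 2}), preds=("t1",)),
    Task("t3", core=2, bcet=10, base_wcet=15, cache_sets=frozenset({0, 2})),
]
wcet, life = analyse(tasks)
print(wcet)  # {'t1': 60, 't2': 25, 't3': 25}
```

In this made-up example the first pass assumes `t2` and `t3` conflict and bounds the latest finish of `t2` at 95 cycles. Once the lifetimes show that `t3` must finish before `t2` can start, the pair is dropped and the bound tightens to 85.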

Acknowledgements

This work was partially supported by Singapore Ministry of Education Academic Research Fund Tier 2, MOE2009-T2-1-033.

Author information

Corresponding author

Correspondence to Tulika Mitra.

About this article

Cite this article

Liang, Y., Ding, H., Mitra, T. et al. Timing analysis of concurrent programs running on shared cache multi-cores. Real-Time Syst 48, 638–680 (2012). https://doi.org/10.1007/s11241-012-9160-2

  • DOI: https://doi.org/10.1007/s11241-012-9160-2
