Abstract
Memory accesses form an important source of timing unpredictability. Timing analysis of real-time embedded software thus requires bounding the time for memory accesses. Multiprocessing, a popular approach for performance enhancement, opens up the opportunity for concurrent execution. However, because different processing cores contend for any shared memory, memory access behavior becomes more unpredictable and hence harder to analyze. In this paper, we develop a timing analysis method for concurrent software running on multi-cores with a shared instruction cache. Communication across tasks is by message passing. Our method progressively improves the lifetime estimates of tasks that execute concurrently on multiple cores, in order to estimate potential conflicts in the shared cache. Possible conflicts arising from overlapping task lifetimes are accounted for in the hit-miss classification of accesses to the shared cache, to provide safe execution time bounds. We show that our method produces lower worst-case response time (WCRT) estimates than existing shared-cache analysis on a real-world embedded application. Furthermore, we exploit instruction cache locking to improve WCRT. By locking some beneficial memory blocks into the L1 cache, both the worst-case execution time (WCET) of the tasks and the L2 cache conflicts are reduced, resulting in better WCRT. Experiments demonstrate that significant WCRT reduction is achieved through cache locking.
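The iterative refinement summarized in the abstract can be illustrated with a deliberately simplified sketch. It is not the analysis from the paper: the `Task` fields, the flat per-interferer `penalty`, and the `analyze` function are all hypothetical stand-ins, and a real analysis would classify individual shared-cache accesses as hits or misses rather than charge a fixed cost. The sketch also ignores same-core scheduling and assumes the tasks are listed in dependency order. It starts from the safe assumption that every task on another core may interfere, then repeatedly recomputes lifetimes and drops interferers whose lifetime windows cannot overlap.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    core: int
    bcet: int        # best-case time, assuming no shared-cache conflicts
    wcet_base: int   # worst-case time, assuming no shared-cache conflicts
    penalty: int     # hypothetical extra cycles per interfering task
    preds: list = field(default_factory=list)  # tasks that must finish first


def analyze(tasks):
    """Toy lifetime refinement; tasks must be in dependency order."""
    # Safe starting point: every task on another core may interfere.
    interferers = {
        t.name: {u.name for u in tasks if u.core != t.core} for t in tasks
    }
    while True:
        wcet = {
            t.name: t.wcet_base + t.penalty * len(interferers[t.name])
            for t in tasks
        }
        earliest_start, earliest_finish, latest_finish = {}, {}, {}
        for t in tasks:
            earliest_start[t.name] = max(
                (earliest_finish[p] for p in t.preds), default=0)
            earliest_finish[t.name] = earliest_start[t.name] + t.bcet
            latest_start = max((latest_finish[p] for p in t.preds), default=0)
            latest_finish[t.name] = latest_start + wcet[t.name]
        # Keep only interferers whose lifetime windows overlap.
        refined = {
            n: {m for m in s
                if earliest_start[n] < latest_finish[m]
                and earliest_start[m] < latest_finish[n]}
            for n, s in interferers.items()
        }
        if refined == interferers:  # fixed point reached
            return wcet, max(latest_finish.values())
        interferers = refined


tasks = [
    Task("A", core=0, bcet=10, wcet_base=12, penalty=5),
    Task("B", core=1, bcet=50, wcet_base=60, penalty=5),
    Task("C", core=1, bcet=10, wcet_base=20, penalty=5, preds=["B"]),
]
wcet, wcrt = analyze(tasks)
print(wcet, wcrt)  # {'A': 17, 'B': 65, 'C': 20} 85
```

In this toy instance, task C cannot start before time 50 while task A finishes by time 22 at the latest, so the two are dropped from each other's interference sets and the response-time bound falls from 90 to 85. Because interference sets only shrink, the loop terminates.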


























Acknowledgements
This work was partially supported by Singapore Ministry of Education Academic Research Fund Tier 2, MOE2009-T2-1-033.
Cite this article
Liang, Y., Ding, H., Mitra, T. et al. Timing analysis of concurrent programs running on shared cache multi-cores. Real-Time Syst 48, 638–680 (2012). https://doi.org/10.1007/s11241-012-9160-2