Static analysis of multi-core TDMA resource arbitration delays

Real-Time Systems

Abstract

In the development of hard real-time systems, knowledge of the Worst-Case Execution Time (WCET) is needed to guarantee the safety of a system. For single-core systems, static analyses have been developed which are able to derive guaranteed bounds on a program’s WCET. Unfortunately, these analyses cannot be applied directly to multi-core scenarios, where the different cores may interfere with each other when accessing shared resources such as shared buses or memories. For the arbitration of such resources, TDMA arbitration has been shown to exhibit favorable timing predictability properties. In this article, we review and extend a methodology for analyzing access delays for TDMA-arbitrated resources. Formal proofs of the correctness of these methods are given and a thorough experimental evaluation is carried out, in which the presented techniques are compared to preexisting ones on an extensive set of real-world benchmarks for different classes of analyzed systems.

Notes

  1. Best-Case Execution Time.

  2. In the case of out-of-order pipelines, the analysis which we present in the following would need to consider each order in which the instructions of a basic block can possibly be executed separately, and merge this information afterwards to obtain a valid overapproximation.

  3. Using the arithmetic mean would be infeasible since we are working with relative values here (Fleming and Wallace 1986).
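Note 3 can be illustrated with a small sketch. It assumes that the relative values are summarized with the geometric mean, as Fleming and Wallace recommend for normalized results; the note itself only rules out the arithmetic mean. The ratios are hypothetical and not taken from the article's evaluation.

```python
from statistics import geometric_mean, mean

# Hypothetical WCET ratios (analysis A relative to analysis B) for two benchmarks.
ratios = [0.5, 2.0]
# The same data expressed the other way round (B relative to A).
inverse = [1 / r for r in ratios]

# The arithmetic mean suggests that each analysis is 25% worse than the other,
# which is contradictory.
arith_fwd, arith_bwd = mean(ratios), mean(inverse)

# The geometric mean is consistent: the two directions multiply to 1.
geo_fwd, geo_bwd = geometric_mean(ratios), geometric_mean(inverse)
```

With the arithmetic mean, the conclusion depends on which analysis is chosen as the baseline; the geometric mean does not have this problem.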

References

  • Aho AV, Lam MS, Sethi R, Ullman JD (2006) Compilers: principles, techniques, and tools, 2nd edn. Addison-Wesley, Reading

  • Altmeyer S, Maiza C, Reineke J (2010) Resilience analysis: tightening the CRPD bound for set-associative caches. In: LCTES ’10: proceedings of the ACM SIGPLAN/SIGBED 2010 conference on languages, compilers, and tools for embedded systems. ACM, New York, pp 153–162. http://rw4.cs.uni-saarland.de/~reineke/publications/ResilienceAnalysisLCTES10.pdf. doi:10.1145/1755888.1755911

  • Andrei A, Eles P, Peng Z, Rosen J (2008) Predictable implementation of real-time applications on multiprocessor systems-on-chip. In: Proceedings of the 21st international conference on VLSI design, VLSID ’08. IEEE Computer Society, Washington, pp 103–110

  • Chattopadhyay S, Roychoudhury A, Mitra T (2010) Modeling shared cache and bus in multi-cores for timing analysis. In: Proceedings of the 13th international workshop on software & compilers for embedded systems, SCOPES ’10. ACM, New York, pp 6:1–6:10

  • Cousot P, Cousot R (1979) Systematic design of program analysis frameworks. In: Proceedings of the 6th ACM SIGPLAN-SIGACT symposium on principles of programming languages (POPL), San Antonio, Texas. ACM, New York, pp 269–282

  • European Space Agency (2012) DEBIE—first standard space debris monitoring instrument. https://gate.etamax.de/edid/publicaccess/debie1.php

  • Fleming P, Wallace J (1986) How not to lie with statistics: the correct way to summarize benchmark results. Commun ACM 29:218–221

  • FlexRay Consortium (2010) FlexRay communications system, protocol specification version 3.0.1. http://www.flexray.com

  • Goossens K, Hansson A (2010) The aethereal network on chip after ten years: goals, evolution, lessons, and future. In: Proceedings of the 2010 design automation conference, Anaheim, California, USA. ACM, New York, pp 306–311

  • Gustavsson A, Ermedahl A, Lisper B, Pettersson P (2010) Towards WCET analysis of multicore architectures using UPPAAL. In: 10th International workshop on worst-case execution time analysis, WCET ’10. Schloss Dagstuhl—Leibniz-Zentrum für Informatik, Dagstuhl, pp 101–112

  • Hardy D, Puaut I (2008) WCET analysis of multi-level non-inclusive set-associative instruction caches. In: Proceedings of the 2008 real-time systems symposium. IEEE Computer Society, Washington, pp 456–466

  • Hardy D, Piquet T, Puaut I (2009) Using bypass to tighten WCET estimates for multi-core processors with shared instruction caches. In: Proceedings of the 2009 30th IEEE real-time systems symposium, RTSS ’09. IEEE Computer Society, Washington, pp 68–77

  • Kelter T, Falk H, Marwedel P, Chattopadhyay S, Roychoudhury A (2011) Bus-aware multicore WCET analysis through TDMA offset bounds. In: Proceedings of the 23rd euromicro conference on real-time systems (ECRTS), Porto/Portugal, pp 3–12

  • Lundqvist T, Stenström P (1999) Timing anomalies in dynamically scheduled microprocessors. In: Proceedings of the 20th IEEE real-time systems symposium, RTSS ’99. IEEE Computer Society, Washington

  • Lv M, Guan N, Yi W, Yu G (2010) Combining abstract interpretation with model checking for timing analysis of multicore software. In: 31st IEEE real-time systems symposium (RTSS)

  • Mälardalen WCET Research Group (2012) Mälardalen WCET Benchmark Suite. http://www.mrtc.mdh.se/projects/wcet

  • Mische J, Guliashvili I, Uhrig S, Ungerer T (2010) How to enhance a superscalar processor to provide hard real-time capable in-order SMT. In: Proceedings of the 23rd international conference on architecture of computing systems (ARCS), Hannover/Germany, pp 2–14. doi:10.1007/987-3-642-11950-7_2

  • Muchnick SS (1997) Advanced compiler design and implementation. Morgan Kaufmann, San Mateo

  • Nemer F, Cassé H, Sainrat P, Bahsoun JP, Michiel MD (2006) PapaBench: a free real-time benchmark. In: Mueller F (ed) 6th intl workshop on worst-case execution time (WCET) analysis, Internationales Begegnungs- und Forschungszentrum für Informatik (IBFI). Schloss Dagstuhl, Dagstuhl

  • Paolieri M, Quiñones E, Cazorla FJ, Bernat G, Valero M (2009) Hardware support for WCET analysis of hard real-time multicore systems. In: Proceedings of the 36th annual international symposium on computer architecture, ISCA ’09. ACM, New York, pp 57–68

  • Paukovits C, Kopetz H (2008) Concepts of switching in the time-triggered network-on-chip. In: Proceedings of the 14th IEEE international conference on embedded and real-time computing systems and applications, pp 120–129

  • Pellizzoni R, Schranzhofer A, Chen JJ, Caccamo M, Thiele L (2010) Worst case delay analysis for memory interference in multicore systems. In: Proceedings of the conference on design, automation and test in Europe, DATE ’10, pp 741–746

  • Pitter C, Schoeberl M (2010) A real-time Java chip-multiprocessor. ACM Trans Embed Comput Syst 10:9

  • Reineke J, Sen R (2009) Sound and efficient WCET analysis in the presence of timing anomalies. In: Holsti N (ed) 9th intl workshop on worst-case execution time (WCET) analysis. Schloss Dagstuhl—Leibniz-Zentrum für Informatik, Dagstuhl

  • Reineke J, Wachter B, Thesing S, Wilhelm R, Polian I, Eisinger J, Becker B (2006) A definition and classification of timing anomalies. In: Proceedings of 6th international workshop on worst-case execution time (WCET) analysis

  • Skutella M (2009) An introduction to network flows over time. Res Trends Comb Optim. doi:10.1007/987-3-540-76796-1_21

  • Suhendra V, Mitra T (2008) Exploring locking & partitioning for predictable shared caches on multi-cores. In: Proceedings of the 45th annual design automation conference, DAC ’08. ACM, New York, pp 300–303

  • Wilhelm R, Engblom J, Ermedahl A, Holsti N, Thesing S, Whalley D, Bernat G, Ferdinand C, Heckmann R, Mitra T, Mueller F, Puaut I, Puschner P, Staschulat J, Stenström P (2008) The worst-case execution-time problem—overview of methods and survey of tools. ACM Trans Embed Comput Syst 7:36:1–36:53

  • Wilhelm R, Grund D, Reineke J, Schlickling M, Pister M, Ferdinand C (2009) Memory hierarchies, pipelines, and buses for future architectures in time-critical embedded systems. IEEE Trans Comput-Aided Des Integr Circuits Syst 28(7):966–978

  • Zhang W, Yan J (2009) Accurately estimating worst-case execution time for multi-core processors with shared direct-mapped instruction caches. IEEE Computer Society Press, Los Alamitos, pp 455–463


Acknowledgements

This work was partially funded by the European Community’s ArtistDesign Network of Excellence, by the European Community’s 7th Framework Program FP7/2007-2013 under grant agreement no 216008, by the German Research Foundation DFG under reference number FA1017/1-1 and by Faculty Research Council grant T1 251RES0914 (R-252-000-416-112) at NUS.

Author information

Corresponding author

Correspondence to Timon Kelter.

Appendix: Proofs

This appendix contains the proofs of all lemmas and theorems from the article, except for the Offset Relocation Lemma (Lemma 3), whose proof is discussed in the text due to its novelty.

Lemma 1 For any \(O \subseteq O^{+}\), \(u_{b}(O)\) contains the offsets of all absolute time instants t such that t is the first cycle after the execution of basic block b, starting at an offset \(o \in O\).

Proof

If a particular execution of the basic block does not access the bus, offexecute from Eq. (4) contains all possible resulting offsets, since \(ET_{b}\) is then the set of all possible running times. If the particular execution of the block does access the bus, the block consists of only a single instruction, according to Definition 2. offaccess contains the possible offsets of the first cycle t after the execution of the basic block for any starting offset \(o \in O\) and runtime \(e \in ET_{b}\). Since the only difference between offexecute and offaccess is the application of the \(\Phi_{p}\) function, we show this by examining the three cases from Eq. (7):

  • In the first case, the access has to be delayed until the start of core p’s slot in the current TDMA hyperperiod.

  • In the second case, the access can be granted immediately, since the bus is allocated to core p and will be allocated to p for at least e cycles.

  • In case three, the access cannot be served in the current TDMA hyperperiod and thus must be delayed to the next TDMA hyperperiod (as shown in Fig. 2).

By taking the union over all possible starting offsets \(o \in O\) and execution times \(e \in ET_{b}\) in Eq. (6), the Lemma follows for this case, too. □
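The three cases can be made concrete with a minimal sketch. Eq. (7) is not reproduced in this appendix, so the slot layout is an assumption: core p owns the half-open offset interval `[slot_start, slot_end)` of a hyperperiod of `period` cycles, and an access of `e` cycles must fit completely into that slot. All function and parameter names are hypothetical.

```python
def access_delay(o, e, slot_start, slot_end, period):
    """Cycles an access of length e, issued at offset o, has to wait (assumed model)."""
    if o < slot_start:
        # Case 1: wait for the start of the core's slot in the current hyperperiod.
        return slot_start - o
    if o + e <= slot_end:
        # Case 2: the slot is active and stays active for at least e cycles.
        return 0
    # Case 3: the access no longer fits, so wait for the slot in the next hyperperiod.
    return period - o + slot_start


def offsets_after_access(offsets, exec_times, slot_start, slot_end, period):
    """Union over all starting offsets and access durations of the resulting offsets."""
    return {
        (o + access_delay(o, e, slot_start, slot_end, period) + e) % period
        for o in offsets
        for e in exec_times
    }
```

For a hyperperiod of 10 cycles with the slot at offsets 4 to 7 and a 2-cycle access, starting offsets 1, 5 and 7 exercise cases one, two and three respectively.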

Theorem 1 The MOP solution provides a valid overapproximation of all offsets with which block b can be entered.

Proof

m joins the offsets resulting from the single b-traces which represent all execution paths leading to b. We must thus only prove that \(u_{q_{b}} ( S )\) is an overapproximation of the offsets which result from the execution of b-trace \(q_{b}\) starting with an offset \(o \in S\). This can be proven via induction over the length of \(q_{b}\), where the induction step is made by applying Lemma 1. □
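The structure of this argument can be sketched in a few lines. The sketch assumes blocks without bus accesses, so that the transfer function only shifts each offset by each possible running time, and it takes the join m to be set union. The representation of a trace as a list of running-time sets is an illustrative assumption.

```python
def transfer(offsets, exec_times, period):
    """Assumed u_b for a block without bus access: shift offsets by each running time."""
    return {(o + e) % period for o in offsets for e in exec_times}


def mop(traces, start, period):
    """Join (here: union) of the offsets obtained along every trace leading to b."""
    result = set()
    for trace in traces:
        offsets = set(start)
        for exec_times in trace:  # apply the transfer functions in trace order
            offsets = transfer(offsets, exec_times, period)
        result |= offsets
    return result
```

For example, `mop([[{2}, {3}], [{4}]], {0}, 10)` combines a two-block path and a one-block path and yields the offsets 4 and 5.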

Theorem 2 For a given interprocedural control flow graph of a task τ and given starting offsets \(O_{in}\), the results \(w \in\mathbb{N}\) and \(O \subseteq O^{+}\) as computed by Algorithm 3 for function \(f^{\mathrm{start}}_{\tau}\) are overapproximations of the WCET and the resulting offsets of any execution of τ which starts with an offset \(o \in O_{in}\).

Proof

We prove the proposition by structural induction over the interprocedural control flow graph.

Base case: The smallest possible graph is a single basic block. Therefore, we have to prove the proposition for a single basic block to give the induction base case. According to Definition 2, the basic block either consists of a single instruction which accesses the bus, or of multiple instructions which do not access the bus.

  • A basic block with a bus access

    In this case, the returned WCET is a valid overapproximation since we compute the maximum over all possible completion times as returned by \(\Phi_{p} + e\).

  • A basic block without a bus access

    In this case, the returned WCET is a valid overapproximation since we maximize over the given \(ET_{b}\) values.

The correctness of the offset result follows from Lemma 1, since the result is computed through a single application of the transfer function u.
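The two base cases can be sketched as follows. The TDMA model is an assumption because Eq. (7) is not part of this appendix: core p owns the offsets `[slot_start, slot_end)` of each hyperperiod, and `access_delay` is a hypothetical stand-in for the delay function.

```python
def access_delay(o, e, slot_start, slot_end, period):
    """Assumed TDMA delay for an access of length e issued at offset o."""
    if o < slot_start:
        return slot_start - o
    if o + e <= slot_end:
        return 0
    return period - o + slot_start


def block_wcet(offsets, exec_times, accesses_bus, slot_start, slot_end, period):
    """WCET bound of a single basic block for a set of possible starting offsets."""
    if not accesses_bus:
        # Without a bus access the bound is the largest running time.
        return max(exec_times)
    # With a bus access, maximize delay plus access time over all offsets and times.
    return max(
        access_delay(o, e, slot_start, slot_end, period) + e
        for o in offsets
        for e in exec_times
    )
```

Because the maximum is taken over every starting offset in the input set, enlarging that set can only increase the returned bound.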

Induction step: The induction step must consider the possible structures which can appear in the CFG. We required our interprocedural control flow graphs to be reducible in Sect. 2. A reducible control flow graph can be inductively defined with the patterns shown in Figs. 15 and 16. Every graph which adheres to Definition 9, which includes our control flow graphs, can be constructed using those inductive patterns (Muchnick 1997). In the patterns, the circles indicate reducible subgraphs. For the induction step, we can assume that the proposition was already shown for the subgraphs. We then must prove that the proposition is also true for the depicted graphs as a whole. This is done by looking at the different cases:

  • Sequential patterns

    According to the induction hypothesis, the WCET and offset results for the subgraphs are valid overapproximations. For the sequential case shown in Fig. 15a we add up the WCETs and combine the offset results in lines 9 to 11 of Algorithm 3. This obviously yields overapproximations for the whole sequence.

    Fig. 15 Sequential structural patterns

    For the case of branches as shown in Figs. 15b and 15c, we compute safe overapproximations, since we take the maximum WCET of any path leading to the end block in line 6 of Algorithm 3. Similarly, we merge the result offsets of all paths reaching the end block in line 7 of the same algorithm. The last sequential case as shown in Fig. 15d is a combination of an if-then with a sequence. Therefore, the correctness for this case follows from the same arguments as in those cases.

  • Cyclic patterns

    The possible cyclic patterns are shown in Fig. 16. We omitted patterns for loops which contain break or continue statements, since the generalization to these cases is a pure technicality. Theorems 4 and 5 from Sect. 5.2 show that our analysis framework correctly overapproximates the WCET and offset sets of loops (cyclic patterns).  □

    Fig. 16 Cyclic structural patterns
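The composition rules for the sequential patterns in the induction step can be sketched as small combinators. This is not Algorithm 3 itself: every analysis is modeled as a hypothetical function from an offset set to a pair of WCET bound and result offsets, and blocks are assumed not to access the bus.

```python
def block(exec_times, period):
    """Analysis of one basic block without bus access (assumed model)."""
    def analyze(offsets):
        out = {(o + e) % period for o in offsets for e in exec_times}
        return max(exec_times), out
    return analyze


def sequence(first, second):
    """Sequence: add the WCETs and feed the result offsets of the first part forward."""
    def analyze(offsets):
        w1, o1 = first(offsets)
        w2, o2 = second(o1)
        return w1 + w2, o2
    return analyze


def branch(*alternatives):
    """Branch: maximum WCET over all paths and union of their result offsets."""
    def analyze(offsets):
        results = [alt(offsets) for alt in alternatives]
        wcet = max(w for w, _ in results)
        merged = set().union(*(o for _, o in results))
        return wcet, merged
    return analyze
```

A block followed by a two-way branch is then written as `sequence(a, branch(b, c))`, which mirrors how the proof assembles larger graphs from subgraphs that are already known to be analyzed safely.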

In the proofs of correctness of the proposed “AnalyzeLoop” functions, we can use the induction hypothesis from Theorem 2, that \(\mathit{wcet}_{l}^{LB} ( O )\) and \(u_{l}^{LB} ( O )\) compute valid overapproximations.

Lemma 4

For two offset sets \(O_{1}\) and \(O_{2}\) with \(O_{1} \subseteq O_{2}\) we observe that \(\mathit{wcet}_{l}^{LB} ( O_{1} ) \leq \mathit{wcet}_{l}^{LB} ( O_{2} )\) and \(u_{l}^{LB} ( O_{1} ) \subseteq u_{l}^{LB} ( O_{2} )\).

For \(\mathit{wcet}_{l}^{LB}\) this can be derived from the monotonicity of \(\Phi_{p}\) and for \(u_{l}^{LB}\) it can be derived from the monotonicity of the m and \(u_{b}\) functions. Thus, \(\mathit{wcet}_{l}^{LB}\) and \(u_{l}^{LB}\) are monotone.

Theorem 3 For given starting offsets \(O_{in,l}\), the global convergence analysis computes safe overapproximations of the loop WCET and result offsets.

Proof

This proof handles the case of cyclic patterns in the proof of Theorem 2 and thus plugs into that proof. If we set \(O_{in}^{i} = O_{out}^{i-1}\) in the analysis, we would perform a fully unrolling analysis, which would be unlikely to converge before the loop bound is reached. The safety of this fully unrolling analysis follows from the safety of the single-iteration analysis, which we can assume since it is the induction hypothesis from Theorem 2. We instead use \(O_{in}^{i} = m ( O_{in}^{i-1}, O_{out}^{i-1} )\), so in our algorithm \(O_{in}^{i} \supseteq O_{out}^{i-1}\) holds. Lemma 4 then implies that the WCET and offset results which we compute per iteration are overapproximations of the real WCET and offsets. This proves the correctness of the algorithm for the first j loop iterations. Then we have two cases:

  • \(j = B^{max}_{l}\)

    In this case, all loop iterations were analyzed and thus the correctness of the analysis was shown for all loop iterations.

  • \(O_{in}^{j} = O_{in}^{j+1}\)

    In this case, since \(O_{in}^{j}\) is a safe overapproximation of the offsets in loop iteration j and \(O_{in}^{j+1} = u_{l}^{LB} ( O_{in}^{j} )\) is a safe overapproximation of the offsets in loop iteration j+1, the loop can never be entered with offsets \(o \notin O_{in}^{j}\) in any succeeding iteration k>j. Therefore the offset and WCET results for the j-th iteration are safe overapproximations for all \(B^{max}_{l} - j\) remaining iterations.  □
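A minimal sketch of such a converging loop analysis is given here. It assumes that the join m is set union and that the loop body analysis is available as a function `body` from an offset set to a pair of WCET bound and result offsets. The function names and the way exit offsets are collected are illustrative assumptions, not the article's algorithm.

```python
def analyze_loop_global(o_in, body, b_max):
    """Iterate O_in^i = O_in^(i-1) | O_out^(i-1) until a fixed point or the loop bound."""
    current = set(o_in)
    total = 0
    exits = set()
    for i in range(1, b_max + 1):
        w, out = body(current)
        total += w
        exits |= out
        nxt = current | out
        if nxt == current:
            # Converged: the bound of this iteration covers all remaining ones.
            total += (b_max - i) * w
            break
        current = nxt
    return total, exits


def example_body(offsets):
    """Hypothetical body with period 4: 5 cycles when entered at offset 3, else 3."""
    cost = {o: 5 if o == 3 else 3 for o in offsets}
    return max(cost.values()), {(o + c) % 4 for o, c in cost.items()}
```

For `analyze_loop_global({0}, example_body, 10)` the iteration converges in the second step with the input offsets 0 and 3 and returns a bound of 48 cycles. The real executions alternate between 3 and 5 cycles and need 40 cycles in total, so the result is safe but pessimistic, which is the price of merging the input offsets.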

Lemma 2 For a loop l, assume \(O_{in,l}\) is an overapproximation of the set of offsets at the entry of the loop before the first iteration. We claim that \(\mathit{reachable}(i) \supseteq O^{\mathrm{real}}_{i}\) is true for all iterations of the loop.

Proof

Let us assume that the construction of the offset graph terminates at iteration m (thus, m is the last iteration of the construction) and the loop bound is i. We prove the proposition by induction over the loop bound.

Base case: We can use the outer induction hypothesis, that the offset results computed by the single-iteration analysis are valid overapproximations. With \(O_{in,l}\) being an overapproximation of the input offsets and \(i=1\), this already proves the proposition since only a single loop iteration is modeled then.

Induction step: Due to the induction hypothesis we know that \(\mathit{reachable}(i) \supseteq O^{\mathrm{real}}_{i}\). We must show that \(\mathit{reachable}(i+1) \supseteq O^{\mathrm{real}}_{i+1}\) holds. To accomplish this, we assume that there is an offset \(o_{err} \in O^{\mathrm {real}}_{i+1}\) with \(o_{err} \notin \mathit{reachable}(i+1)\). We will show that this leads to a contradiction.

If such an offset \(o_{err}\) exists, then by definition of \(O^{\mathrm{real}}_{i+1}\) there must be a possible execution scenario A in which the (i+1)-th loop iteration is entered with offset \(o_{err}\). Let \((a_{1}, a_{2}, \ldots, a_{i+1})\) be the offsets with which the first i+1 iterations of the loop are entered in scenario A. Note that this implies \(a_{i+1} = o_{err}\). Since we assume that \(o_{err} \notin \mathit{reachable}(i+1)\), there must be at least two such offsets \(a_{p}\) and \(a_{q}\) for which \(( v_{a_{p}}, v_{a_{q}} ) \notin E\). Using the induction hypothesis it follows that \(a_{p} \in \mathit{reachable}(i)\) and thus that \(p=i\) and \(q=i+1\).

Since \(a_{p}\) is reachable in the graph, there must have been a construction iteration \(j < \min(m,i)\) with \(a_{p} \in O_{out}^{j}\) and \(a_{p} \notin O_{in}^{j}\) where offset \(a_{p}\) was reached for the first time. In construction iteration j+1 we add all edges \(E_{j+1} = O_{out}^{j} \times O_{out}^{j+1}\) to the graph. Since \(O_{out}^{j+1} = u_{l}^{LB} ( O_{out}^{j} )\) and \(a_{p} \in O_{out}^{j}\), it follows that \(a_{q} \in O_{out}^{j+1}\), since \(u_{l}^{LB}\) yields a safe overapproximation of the offsets and offset \(a_{p}\) is followed by offset \(a_{q}\) in scenario A. Therefore, we have \(( v_{a_{p}}, v_{a_{q}} ) \in E\), which is a contradiction. □
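The construction used in this proof can be sketched as follows. The edge rule follows the text, adding all pairs of the product of two successive result sets. Everything else is an assumption made for illustration: the termination test (stop once no new offset appears), the treatment of the entry offsets as the first set, and the function names. The body analysis `step` is assumed to be monotone.

```python
def build_offset_graph(o_in, step, max_iters):
    """Add the edges prev x step(prev) until no new offsets appear (assumed rule)."""
    edges = set()
    prev = frozenset(o_in)
    seen = set(prev)
    for _ in range(max_iters):
        nxt = frozenset(step(prev))
        edges |= {(a, b) for a in prev for b in nxt}
        if nxt <= seen:
            break  # every offset in nxt already has its outgoing edges
        seen |= nxt
        prev = nxt
    return edges


def reachable(o_in, edges, i):
    """Offsets with which iteration i (counted from 1) may be entered in the graph."""
    frontier = set(o_in)
    for _ in range(i - 1):
        frontier = {b for (a, b) in edges if a in frontier}
    return frontier
```

With a hypothetical body that always takes 3 cycles on a period of 8, `step = lambda O: {(o + 3) % 8 for o in O}`, the graph built from entry offset 0 is a single cycle through all eight offsets, and `reachable({0}, edges, 3)` contains only offset 6.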

Theorem 4 Let us assume \(O^{real}_{in,l}\) is the set of offsets with which loop l may be entered in the first iteration. Given that \(O_{in,l} \supseteq O^{real}_{in,l}\), the graph tracking analysis always computes an overapproximation of the total execution time of the loop.

Proof

We prove this by induction on the loop bound \(B_{l}^{max}\).

Base case (\(B_{l}^{max}=1\)): In this case, the objective function (Eq. (22)) simply takes the maximum of \(c(e)\) where \(e \in E_{transition}\) and \(\mathit{src}(e) \in O_{in,l}\). Note that for any \(e \in E_{transition}\), \(c(e)\) represents the worst-case execution time of one loop iteration (computed by Algorithm 2) starting at offset \(\mathit{src}(e)\). Therefore, \(\max_{src(e) \in O_{in,l}} c(e)\) precisely represents the WCET of the first loop iteration. For \(B_{l}^{max}=1\) the ILP target function (Eq. (22)) is equal to this maximization, which proves the base case.

Induction step: We assume that the WCET computation is sound for loop bound \(B_{l}^{max} = n\). We shall show that the computation is also sound for loop bound \(B_{l}^{max}=n+1\). Let the actual WCET of the entire loop l with n iterations be denoted by \(\mathit{WCET}(l,n)\), and let the actual WCET of the n-th iteration of the loop be denoted by \(\mathit{WCET}_{iter}(l,n)\). According to the graph tracking analysis, we compute the WCET of the loop with n+1 iterations as

$$\begin{aligned} \max\sum_{e \in E}\sum_{t \in T} c(e)x(e,t) \end{aligned}$$
(30)

where E is the set of all edges in the offset graph and T={0,…,n+1}. However,

$$\begin{aligned} \max\sum_{e \in E} \sum_{t \in T} c(e)x(e,t) = \max\sum _{e \in E}\sum_{t \in T'}c(e)x(e,t) + \max \sum_{e \in E}c(e)x(e,n+1) \end{aligned}$$
(31)

where T′={0,…,n}. By induction hypothesis, we have

$$\begin{aligned} \max\sum _{e \in E}\sum_{t \in T'}c(e)x(e,t) \ge \mathit{WCET}(l,n) \end{aligned}$$
(32)

From Lemma 2 we know that \(\mathit{reachable}(n+1) \supseteq O^{\mathrm{real}}_{n+1}\). If an offset node is not reachable in iteration n+1, then it cannot contribute to Eq. (31), therefore

$$\begin{aligned} \max\sum_{e \in E}c(e)x(e,n+1) &= \max\sum_{e \in \{ (v,w) \in E \mid v \in \mathit{reachable}(n+1) \}}c(e)x(e,n+1) && \text{(33)} \\ &\ge \max\sum_{e \in \{ (v,w) \in E \mid v \in O^{\mathrm{real}}_{n+1} \}}c(e)x(e,n+1) && \text{(34)} \\ &= \mathit{WCET}_{iter}(l,n+1) && \text{(35)} \end{aligned}$$

Inserting Eqs. (35) and (32) into Eq. (31) provides the induction step. Thus, the proposition is proven. □
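The objective in Eq. (30) selects one edge of the offset graph per loop iteration so that the summed costs are maximal. For small graphs the same maximum can be computed by a simple dynamic program over the iterations, sketched here as a stand-in for the ILP and not as the article's formulation. The `cost` dictionary, the node names and the function name are illustrative assumptions, and every reachable node is assumed to have at least one outgoing edge.

```python
def loop_wcet_dp(cost, entry, n):
    """Maximum total cost of a walk with exactly n edges that starts in an entry node.

    cost maps an edge (src, dst) to c(e), the WCET bound of one loop iteration
    that is entered with offset src.
    """
    best = {v: 0 for v in entry}  # best[v]: largest cost of a walk ending in v
    for _ in range(n):
        nxt = {}
        for (src, dst), c in cost.items():
            if src in best:
                candidate = best[src] + c
                if candidate > nxt.get(dst, -1):
                    nxt[dst] = candidate
        best = nxt
    return max(best.values())
```

For a hypothetical graph with `cost = {(0, 3): 3, (3, 0): 5}` and entry offset 0, ten iterations alternate between the two edges and the function returns 40, while a single iteration yields 3, which matches the base case of the proof.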

Theorem 5 Computation of \(O_{out,l}\) is sound. More precisely, \(O_{out,l}\) predicted by the graph tracking analysis always overapproximates the set of offsets with which a loop may be left.

Proof

We are sending \(s_{l} n_{c}\) flow units through the graph. Each one of these units models an independent execution of the loop. Each of these modeled executions (say they are numbered with \(i \in \{1,\ldots,n_{c} s_{l}\}\)) will exit the loop with some offset \(o_{end,i}\). The unknown set of all possible exit offsets is K. What we must show is that \(K \subseteq \{o_{end,i} \mid i \in \{1,\ldots,n_{c} s_{l}\}\}\).

What we maximize in Eq. (23) is the cardinality of the set of offsets with which the \(s_{l} n_{c}\) flow units exit the loop. By Lemma 2, the reachable offsets in the flow graph are an overapproximation of the reachable offsets in the real loop execution for all iterations \(j \in\{1,\ldots,B_{l}^{max}\}\). Therefore, if the loop can be left in iteration \(k \in\{B_{l}^{min}, \ldots, B_{l}^{max}\}\) with offset \(o_{left}\) during a real loop execution, then it is possible to construct a flow with one flow unit i which starts at \(v^{+}\) at time 0 and takes the edge \(e = (v_{o_{left}}, v^{-})\) at time step k, thus \(o_{end,i} = o_{left}\) for a given \(o_{left}\).

Up to this point we have shown that for each exit offset \(o_{left} \in K\) we can construct a flow with one flow unit that exits the loop with this offset. It is also possible that we get flows which end with offsets \(o_{err} \notin K\), but that is no problem since we only require an overapproximation of the offsets. If we now assume that we compute a solution \(O_{out,l}\) with an offset \(k \in K\) and \(k \notin O_{out,l}\), then we can show that this leads to a contradiction:

  1. \(|O_{out,l}| = s_{l} n_{c}\)

    In this case, the set \(O_{out,l}\) represents all possible offsets, therefore an offset \(k \notin O_{out,l}\) cannot exist.

  2. \(|O_{out,l}| < s_{l} n_{c}\)

    In this case, there must be at least two flow units i and j with \(o_{end,i} = o_{end,j}\), since we used \(F = n_{c} s_{l}\) flow units in total. Since \(k \in K\) holds, there exists a valid flow f through the graph which exits the loop with offset k (as shown in paragraph 2). If we let one of the flow units, say i, follow that flow f instead of the flow which it followed in the original solution, then we get a new solution to the flow problem in which \(|O_{out,l}|\) is increased by 1, compared to the previous solution. Since the original solution to the flow problem must have been maximal with respect to \(|O_{out,l}|\), this is a contradiction.  □
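The set that the flow formulation has to cover can be sketched with plain reachability. The sketch replaces the flow problem of Eq. (23) with a breadth-first traversal of the offset graph, so it is a stand-in and not the article's method. It assumes that the offset with which the loop is left after k iterations is the node reached after k edges; all names are hypothetical.

```python
def exit_offsets(edges, entry, b_min, b_max):
    """Offsets reachable after k iterations for any k between b_min and b_max."""
    frontier = set(entry)
    exits = set()
    for k in range(1, b_max + 1):
        # Advance every possible offset by one loop iteration.
        frontier = {dst for (src, dst) in edges if src in frontier}
        if k >= b_min:
            exits |= frontier
    return exits
```

For the hypothetical graph with the edges `(0, 3)` and `(3, 0)` and entry offset 0, a loop that runs exactly once exits with offset 3, while loop bounds between 2 and 3 iterations give the exit offsets 0 and 3.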

Corollary 1 The analysis framework, using the graph tracking analysis, provides overapproximations of the WCET of any task τ executed with starting offset \(O_{in,\tau}\) on our assumed platform.

Proof

Theorems 4 and 5 provide the missing induction step case for the proof of Theorem 2. The WCET for the \(f^{\mathrm{start}}_{\tau}\) function is the WCET of the task. Following Theorem 2, our analysis framework together with the graph tracking analysis produces valid WCET overapproximations for this function and thus also for the task. □

About this article

Cite this article

Kelter, T., Falk, H., Marwedel, P. et al. Static analysis of multi-core TDMA resource arbitration delays. Real-Time Syst 50, 185–229 (2014). https://doi.org/10.1007/s11241-013-9189-x

  • DOI: https://doi.org/10.1007/s11241-013-9189-x
