1 Introduction
Response-time analysis is crucial in designing real-time systems. With over 80% of industrial systems now relying on multicore platforms [
1], accurate analysis of global scheduling policies, such as global fixed-priority and EDF, is becoming increasingly important. Supported by operating systems like Linux, RTEMS, and VxWorks, global scheduling improves load balancing and reduces average response times. However, these benefits can only be achieved for real-time systems if we can also guarantee that timing requirements will be met. This requires a response-time analysis that provides accurate bounds on the response times of tasks and that scales effectively with the size of the system.
Related work. Most schedulability analyses for global scheduling policies rely on the quantification of carry-in workload and the concept of busy-window [
3,
4,
6,
7,
20,
22,
34,
36,
38,
39]. However, as demonstrated in [
8,
18,
24,
25], these analyses tend to be overly pessimistic, especially for larger systems with more tasks or cores.
The literature also includes exact schedulability tests using timed automata [
14,
19,
37], linear hybrid automata [
33,
35], simulation-based tests [
12], reachability analysis via timed-labeled transition systems [
17], and other reachability-based methods [
5,
9,
10,
11]. Despite steady progress in scalability, these analyses face limitations as the number of tasks or cores increases, as shown in [
18,
25,
31].
The schedule-abstraction graph (SAG) technique, introduced in [
23,
24,
25,
28], is a recent reachability-based response-time analysis that systematically explores the decision space of global job-level fixed-priority (JLFP) scheduling policies. It constructs a directed acyclic graph (DAG) whose vertices represent reachable system states and whose edges represent scheduling decisions (the dispatching of a job) that evolve one system state into another. This technique provides highly accurate response-time bounds by automatically identifying interference scenarios under global scheduling. SAG has been applied to preemptive [
18] and various non-preemptive scheduling problems [
15,
16,
24,
25,
26,
27,
31,
32].
Despite its accuracy and relatively good scalability, the size of a schedule-abstraction graph can grow exponentially when tasks have large release jitter or execution time variations. This growth occurs because the SAG technique explores one scheduling decision at a time, with each edge in the graph representing only a single job. To address this limitation, Ranjha et al. [
28,
29,
30] proposed a
partial-order reduction (POR) technique for single-core platforms, which combines multiple scheduling decisions during the exploration of the SAG graph by looking ahead into their collective completion time on the processing core. They used a set of heuristics to derive a safe but quick upper bound for the completion time of these jobs. Their technique proved to be very efficient for single-core platforms, reducing the runtime of the SAG by
five orders of magnitude.
However, the POR in [
28,
29,
30] cannot be directly applied to multicore platforms. Identifying ‘safe’ scenarios for analyzing multiple jobs simultaneously requires fast yet accurate sufficient schedulability tests that ensure no deadline misses among batched jobs, but the existing ‘fast’ busy-window-based tests are overly pessimistic and often reject potentially safe batches. Moreover, Ranjha’s analysis [
28,
29,
30] is designed for single-core platforms and does not
account for the inherent parallelism offered by global scheduling policies on multicore platforms, where some ready tasks can run independently on different cores and therefore do not interfere with each other.
This paper. We introduce a state-space exploration strategy to reduce the size of the schedule-abstraction graph (SAG) and apply it to the SAG analyses for both preemptive [
18] and non-preemptive [
25] global job-level fixed-priority (JLFP) scheduling policies (e.g., G-EDF or G-FP). This state-space reduction technique is a form of
partial-order reduction (POR) and exploits the inherent parallelism in task execution under global policies. It enables us to identify independent sets of jobs that can execute on different cores without interfering with each other. To effectively implement this technique, we address three research questions.
(RQ1)
How many jobs can start (or resume) their execution independently on the platform in a given system state?
(RQ2)
Which set(s) of jobs may start or resume their execution independently in a given system state?
(RQ3)
What will be the state of the system after dispatching these jobs?
By answering the above questions, we propose a new exploration strategy for the SAG that reduces the number of explored states while maintaining or sometimes improving the accuracy of the SAG analysis according to our empirical evaluations. Our experimental results also demonstrate that for preemptive task sets scheduled by global EDF – one of the most time-consuming problems to analyze with the SAG – our technique reduces the average analysis runtime by a factor of 157.3 (e.g., for systems with 6 to 20 tasks, 60% utilization, and 4 cores).
2 System model and definitions
We focus on deriving response time bounds for a finite set of independent jobs \(\mathcal {J}\) with arbitrary release times on a homogeneous multicore platform with m cores. For example, \(\mathcal {J}\) may represent all jobs released by periodic or multiframe tasks within one or multiple hyperperiod(s) (the least common multiple of the task periods).
A job
\(j \in \mathcal {J}\) is defined by its earliest release time
rmin (
j) (also known as
arrival time in Audsley’s terminology [
2]), latest release time
rmax (
j), absolute deadline
d(
j), best-case execution time (BCET)
Cmin (
j), worst-case execution time (WCET)
Cmax (
j), and priority
p(
j) which is assigned by the scheduling policy. We assume that all the job timing parameters are integer multiples of the processor clock.
We consider a system that employs a global work-conserving job-level fixed-priority (JLFP) scheduling policy, which includes policies such as earliest-deadline first (EDF) and fixed-priority (FP). For example, with EDF, the absolute deadline of a job is also its priority, i.e., p(j) = d(j). We assume that a lower numerical value for p(j) indicates a higher priority, with priority ties broken in an arbitrary but consistent manner. We assume that the “<” operator implicitly follows this tie-breaking rule.
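As a concrete (hypothetical) illustration of the job model and the JLFP priority order above, consider the following sketch. The struct and function names are ours, not taken from any SAG codebase, and the tie-breaking index is one possible choice of a consistent tie-breaker.

```cpp
#include <cassert>

// Hypothetical encoding of a job's timing parameters (all values are
// integer multiples of the processor clock, as assumed in the model).
struct Job {
    int id;     // unique index, used here only to break priority ties
    int r_min;  // earliest release time rmin(j) (arrival time)
    int r_max;  // latest release time rmax(j)
    int d;      // absolute deadline d(j)
    int c_min;  // best-case execution time Cmin(j)
    int c_max;  // worst-case execution time Cmax(j)
};

// Under global EDF, p(j) = d(j). A lower numerical value means a higher
// priority; ties are broken in an arbitrary but consistent manner, here
// (hypothetically) by the unique job index, making "<" a strict total order.
bool higher_priority_edf(const Job& a, const Job& b) {
    int pa = a.d, pb = b.d;  // p(j) = d(j) for EDF
    return (pa != pb) ? (pa < pb) : (a.id < b.id);
}
```

Because the tie-breaker is deterministic, exactly one of two distinct jobs is higher priority than the other, which is what the analysis relies on.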
We assume the scheduling policy can either be preemptive or non-preemptive. If the scheduling policy is non-preemptive, then a job that starts executing will always run to completion, potentially generating blocking for higher-priority jobs. If, on the other hand, the scheduling policy is preemptive, then a lower-priority job may be preempted as soon as a higher-priority job is released.
For ease of notation, we define min∞{X} and max0{X} for a set of positive integers X as follows: if X ≠ ∅, then max0{X} = max{X} and min∞{X} = min{X}; otherwise (X = ∅), max0{X} = 0 and min∞{X} = ∞.
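These two guarded extrema can be sketched as small helpers (the names max0 and minInf are ours, and INT_MAX serves as a stand-in for ∞):

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// max0{X}: maximum of X, or 0 if X is empty.
int max0(const std::vector<int>& X) {
    return X.empty() ? 0 : *std::max_element(X.begin(), X.end());
}

// min∞{X}: minimum of X, or "infinity" (here INT_MAX) if X is empty.
int minInf(const std::vector<int>& X) {
    return X.empty() ? std::numeric_limits<int>::max()
                     : *std::min_element(X.begin(), X.end());
}
```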
3 Schedule-abstraction graph
The SAG analysis constructs a reachability graph representing the system states reachable when considering all possible job dispatch orderings under a given JLFP policy [
18,
25]. Each system state
v explored during the construction of the SAG tracks the availability of the cores using
m uncertainty intervals, each indicating the earliest and latest times by which
x cores (where
x ranges from one to
m) become simultaneously free. This core availability representation leverages the symmetry of identical cores to reduce the state space. The system state also records information on completed and preempted jobs as follows (see [
18,
25]):
•
Core availability intervals. These intervals A1(v), A2(v),..., Am(v) denote the times at which one, two,..., or m cores may become simultaneously free. The bounds of an interval \(A_x(v) = [A_x^{\min }(v), A_x^{\max }(v)]\) (1 ≤ x ≤ m) specify the earliest time x cores are potentially available (\(A_x^{\min }(v)\)) and the earliest time x cores are certainly available at once (\(A_x^{\max }(v)\)).
•
Set of completed jobs. \(\mathcal {J}^{\mathit {Co}}(v)\) contains jobs completed before reaching state v. Thus, \(\mathcal {J}\setminus \mathcal {J}^{\mathit {Co}}(v)\) represents the set of jobs that have not yet been dispatched or have been preempted and therefore may appear in the scheduler ready queue in state v.
For preemptive scheduling policies, each state also includes:
•
Set of preempted jobs. \(\mathcal {J}^{\mathit {Pr}}(v)\) contains the set of jobs that have started but not yet completed their execution. In any state v, \(\mathcal {J}^{\mathit {Pr}}(v) \subseteq \mathcal {J}\setminus \mathcal {J}^{\mathit {Co}}(v)\).
•
Finish times of preempted jobs. The SAG assumes that a preemption may divide a job into virtual segments: one executed before the preemption and one after. For each preempted job
\(j \in \mathcal {J}^{\mathit {Pr}}(v)\), it stores the finish time
FT(j, v) of the latest executed segment.
•
Remaining execution times of preempted jobs. The remaining execution time of each preempted job \(j \in \mathcal {J}^{\mathit {Pr}}(v)\) is stored in RM(j, v) = [RMmin (j, v), RMmax (j, v)], where RMmin (j, v) and RMmax (j, v) represent the best-case and worst-case remaining execution times of job j in state v, respectively.
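To make the state contents concrete, a SAG system state for a preemptive policy could be sketched as follows. The field names are ours; see [18, 25] for the authors' exact definitions.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

struct Interval { int lo, hi; };  // [earliest, latest]

// Hypothetical sketch of a SAG system state v on m cores.
struct SystemState {
    // A[x-1] = A_x(v) = [A_x^min(v), A_x^max(v)]: earliest time x cores
    // are possibly free, and earliest time x cores are certainly free.
    std::vector<Interval> A;
    std::set<int> completed;            // J^Co(v), by job index
    std::set<int> preempted;            // J^Pr(v), subset of J \ J^Co(v)
    std::map<int, int> finish_time;     // FT(j, v) of the latest segment
    std::map<int, Interval> remaining;  // RM(j, v) = [RM^min, RM^max]
};

// Basic sanity checks implied by the definitions above.
bool well_formed(const SystemState& v) {
    for (const Interval& ax : v.A)
        if (ax.lo > ax.hi) return false;         // A_x^min <= A_x^max
    for (int j : v.preempted)
        if (v.completed.count(j)) return false;  // J^Pr and J^Co disjoint
    return true;
}
```

Note how the core-availability vector exploits the symmetry of identical cores: the state stores only how many cores are free by when, not which cores.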
Construction of the graph. SAG uses a breadth-first strategy to explore the state space of possible scheduling decisions in each reachable system state. To do so, it expands each reachable state at a given depth of the graph and then merges states whose futures can be explored together. During expansion, it selects the state with the fewest completed jobs and identifies all jobs that may be the first to be dispatched in that state, i.e., all jobs that may be at the head of the ready queue in that state. Each of these jobs labels an edge representing a scheduling decision that transitions the current state to a new one.
To decide which jobs may be dispatched next in a system state v, the SAG computes, for each job j that has not yet completed its execution, bounds on the earliest start time (EST(j, v)) and the latest start time (LST(j, v)) of j in state v, assuming j is the next job dispatched on a core by the scheduler. If EST(j, v) ≤ LST(j, v), then there exists an execution scenario in which j is the first job dispatched for execution in system state v. Thus, j is added as the label of an edge evolving state v into a new state. If EST(j, v) > LST(j, v), then there is no scenario in which j may be the first job to start in state v.
The SAG calculates the bounds EST(j, v) and LST(j, v) considering the work-conserving and JLFP properties of the underlying scheduling policy. Namely, it compares the time when the first core becomes free in state v with the times at which j and any job with a higher priority than j become ready in system state v.
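A simplified sketch of this eligibility test is shown below. It omits several refinements of the full analysis in [18, 25], and the variable names (in particular t_high for the earliest time a higher-priority job is certainly ready) are ours.

```cpp
#include <algorithm>
#include <cassert>

// Simplified dispatch-eligibility check for a job j in state v.
//   r_min, r_max   : earliest / latest release time of j
//   a1_min, a1_max : bounds of A_1(v), i.e., when the first core frees up
//   t_high         : earliest time a job with higher priority than j is
//                    certainly ready (a large sentinel if none exists)
// j may be the next dispatched job iff EST(j, v) <= LST(j, v).
bool may_be_dispatched_next(int r_min, int r_max,
                            int a1_min, int a1_max, int t_high) {
    int est = std::max(r_min, a1_min);   // earliest start time of j
    int t_wc = std::max(r_max, a1_max);  // work-conserving bound: by then,
                                         // the scheduler must pick some job
    int lst = std::min(t_wc, t_high - 1);  // j must start before a
                                           // higher-priority job takes over
    return est <= lst;
}
```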
4 Motivation and challenges
Current shortcomings. Current state-space exploration of the SAG for multicore platforms [
18,
25] assesses a job’s eligibility for dispatch in a system state
v by considering the availability interval of a single core (i.e.,
A1), without factoring in the total number of available cores. As a result, the analysis accounts for the dispatch of only one job at a time, which in turn generates far more states and edges in the SAG than are necessary to capture all interference scenarios, as we show in the following example.
Consider the set of four jobs with release jitter given in Fig.
1 a. Assume they are scheduled with non-preemptive EDF on three cores. Fig.
1 b shows the SAG generated by the existing exploration strategy. Assume that all three cores are initially free at time 0. Since, due to their release jitters,
j1,
j2, and
j3 can all be the first job released, they can all be the first job dispatched and the SAG generates one edge for each of them after the initial state
v1. In total, it explores 7 states (
v2 to
v8) to account for all possible orderings of jobs dispatch decisions taken by the scheduling policy until reaching the state where the three jobs
j1,
j2, and
j3 have started executing (
v8). The current SAG analysis framework does not leverage the fact that all cores are available at time 0 and the three jobs
j1,
j2, and
j3 can execute in parallel on those cores. In contrast, our new exploration strategy, presented in the next section, detects that the start times of
j1,
j2, and
j3 are independent of each other as they certainly run on different cores. It uses that fact to bypass the exploration of states
v2 to
v7 and directly jump to state
v8, thus generating a much smaller SAG as shown in Fig.
1 c.
Our approach fundamentally differs from previously proposed partial-order reduction techniques for single-core SAG analysis [
28,
29,
30]. While those POR methods combine scheduling decisions (the dispatching of jobs) as long as they do not result in a deadline miss, they do not identify
non-interfering jobs as in the method we present in this paper. Instead, they focus on
interfering jobs, fast-forwarding the timeline to a future state where all those jobs have been completed. Moreover, as discussed in Sec.
1, Ranjha’s technique is not applicable to multicore platforms because there is no method that is both fast and
accurate to estimate the completion time of a set of interfering jobs under global scheduling. Our strategy avoids this issue by focusing on identifying and analyzing non-interfering jobs by leveraging the parallelism of multicore platforms.
Challenges. While Fig.
1 c illustrates the core concept of our state exploration strategy, it does not fully capture the complexities of identifying a set of independent jobs and creating a new state in the graph. Consider another example (Fig.
2) where the SAG is midway through analyzing a system with three cores, needing to identify the next batch of eligible jobs to dispatch. Fig.
2 a shows the set of jobs that have not yet been dispatched in state
v4 (shown in Fig.
2 b).
The first challenge (RQ1) is to determine the number of jobs that can execute in parallel without interfering with one another. Due to timing uncertainties, the exact time at which each core becomes free is unknown, leaving the exact number of simultaneously available cores uncertain. Additionally, release jitter makes the exact number of pending jobs unpredictable. For example, in Fig.
2, at time 30, either 2 or 3 cores might become available, and any combination of jobs
j1,
j2, and
j3 could be released, resulting in 8 possible scenarios.
The second challenge (RQ2) is to determine which jobs can be dispatched together in a batch without interfering with one another, while avoiding the exhaustive exploration of all possible dispatch orders. For instance, job j4 cannot be batched with job j1 because jobs j2 and j3 might be released between the dispatches of j1 and j4, and therefore indirectly interfere with j4.
The final challenge (RQ3) is to define the new system state after dispatching a batch of jobs. For example, after dispatching a batch containing jobs j1 and j2 in state v4, we must update the core availability intervals, adjust the remaining execution times of preempted jobs, and perform other state modifications for a batch of jobs.
The next section explains how we address these challenges.
6 Empirical Evaluation
We conducted experiments to examine the impact of system utilization (U), the number of tasks (n), the number of cores (m), and release jitter on the effectiveness of our exploration method.
Baseline. We compare our new exploration method for both non-preemptive and preemptive task sets scheduled by global EDF against the traditional SAG exploration methods, i.e.,
SAG-NP [
25] for non-preemptive (code version 2.3.0), and
SAG-Pr [
18] for preemptive task sets (code version 1.1.0).
Evaluation platform. All methods were executed as a single-threaded C++ program on a computing cluster powered by AMD Rome 7H12 processors clocked at 2.6GHz with 1TB of memory. The runtime of each analysis was reported by measuring its CPU time.
Task set generation. We generated synthetic periodic task sets, where the period
Ti of a task in the task set was randomly chosen from the interval 10,000 μs to 100,000 μs with a log-uniform distribution (following Emberson’s method [
13]). We then used
RandFixedSum [
13] to generate random utilizations
Ui for each task such that they sum to the target system utilization for each task set. The WCET of a task is then given by
Ui ×
Ti. For each task, we assumed a release jitter of 20
μs, a BCET equal to 80% of the WCET, and a deadline equal to the period. We discarded task sets with more than 100,000 jobs per hyperperiod (note that industrial task sets usually have only a few thousand jobs in their hyperperiod [
21]).
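The log-uniform period generation described above can be sketched as follows. This follows the idea of Emberson et al. [13]; the function name and the choice of RNG are ours.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Draw a task period T_i in [10,000 us, 100,000 us] such that log10(T_i)
// is uniformly distributed over [4, 5], i.e., a log-uniform distribution.
int draw_period_us(std::mt19937& rng) {
    std::uniform_real_distribution<double> exp10(std::log10(10000.0),
                                                 std::log10(100000.0));
    return static_cast<int>(std::llround(std::pow(10.0, exp10(rng))));
}
```

Drawing the exponent uniformly (rather than the period itself) avoids over-representing long periods and yields the spread of period magnitudes typical of Emberson-style task-set generation.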
Each experiment considers 200 randomly generated task sets per data point. Each analysis method is allocated a maximum time budget of 8 hours to complete the analysis.
Experiment 1 (impact of utilization). We consider a system with 4 cores, 6 tasks per task set, and vary the total utilization
U from
\(30\%\) to
\(90\%\). As shown in Fig.
3 a, in both preemptive and non-preemptive analyses, we observe a decrease in the schedulability ratio as utilization increases. For the non-preemptive analysis, our new state-space exploration detects the same number of schedulable task sets as
SAG-NP. However, for the preemptive analysis, our method yields slightly different schedulability results than
SAG-Pr. For instance, at
\(U=50\%\), our approach detects 97% of task sets as schedulable, compared to 99.5% with
SAG-Pr, and at
\(U=80\%\), our method shows a slight improvement and detects 36% of task sets as schedulable, compared to 23.5% with
SAG-Pr.
We attribute these differences to the smaller number of system states generated by our new exploration method. With fewer states, fewer state merges occur, and merging is an important source of pessimism in the analysis.
The reduction in number of states is confirmed in Fig.
3 c, which shows the average number of system states generated by each analysis. Our exploration method on average reduces the number of states by 46% compared to
SAG-NP, and by 81% compared to
SAG-Pr, across all data points.
Fig.
3 b shows that the average runtime of both our new and the original SAG exploration methods remains within a few seconds for both preemptive and non-preemptive systems. Moreover, our new method is consistently faster than the original SAG.
Fig.
3 d depicts the runtime for each task set as a function of the number of jobs in its hyperperiod. Generally, preemptive task sets take longer to analyze than non-preemptive ones, and the runtime increases with the number of jobs in the hyperperiod.
Experiment 2 (impact of the number of tasks). We consider a system with 4 cores, total utilization of 60%, and vary the number of tasks
n from 6 to 20. As shown in Fig.
3 e, the schedulability of non-preemptive tasks generally increases as the number of tasks grows due to a reduction in each task’s utilization and therefore execution time (which leads to shorter blocking time/interference for other tasks). Our method identifies 33.62% of non-preemptive task sets as schedulable, compared to 33.68% identified by
SAG-NP. More precisely, out of 1,600 task sets, our new exploration method missed just one of the schedulable sets identified by
SAG-NP.
For preemptive task sets, there is no schedulability difference between our exploration method and SAG-Pr for task sets with 8 to 14 tasks. However, as the number of tasks increases, SAG-Pr often sees a timeout before reaching a conclusion, whereas our method completes the analysis within the 8-hour limit (taking less than 18 seconds on average). Overall, our analysis identifies 96.75% schedulable task sets, compared to 93.81% identified by SAG-Pr.
These observations confirm that our new exploration method tends to identify more schedulable task sets than the original analysis with very rare cases of missing schedulable task sets (in 5 cases out of 3200 tested sets in this experiment for both preemptive and non-preemptive tasks).
Fig.
3 f shows the average runtime of the analyses as a function of the number of tasks for preemptive and non-preemptive task sets. The runtime increases as the number of tasks grows, but our technique almost always has a smaller runtime than the original SAG. This difference is noticeable for both preemptive and non-preemptive task sets, where our technique cuts the average runtime of the preemptive analysis by 157 times compared to
SAG-Pr, reducing it from 2826.32 seconds (≈ 47 minutes) to 17.96 seconds. Moreover, for non-preemptive task sets, our method reduced the runtime by 145 times compared to
SAG-NP, reducing it from 529.44 seconds (≈ 8 minutes) to 3.65 seconds.
The runtime reduction is mainly due to the reduced number of system states explored by our method, particularly for large preemptive task sets (with many tasks or many jobs), which are hardest to analyze for the current SAG analysis. Fig.
3 g shows the average number of system states generated by each analysis. Our exploration method on average reduces the number of states by 7% compared to
SAG-NP, and by 10% compared to
SAG-Pr, across all data points. Furthermore, Fig.
3 h shows the average number of edges in the explored graph for each analysis. As shown in Fig.
3 h, the average number of edges (an important indicator of the total number of generated states, both merged and unmerged) is consistently smaller for our method than for the original SAG.
Experiment 3 (impact of the number of cores). We varied the number of cores (
m) for systems with
\(U=60\%\) and
n = 1.5 ×
m tasks. As shown in Fig.
3 i, for non-preemptive task sets, the schedulability generally decreases for both our solution and
SAG-NP as the number of cores increases. In contrast, for preemptive task sets, the schedulability of our method remains unaffected by the increase in cores, while
SAG-Pr experiences a decline due to reaching the timeout limit. Our analysis, however, completed successfully within the 8-hour budget. Overall, in this experiment, both methods detect 21.25% of the task sets as schedulable for non-preemptive systems. For preemptive systems, however, our method detects 97.5% of the task sets as schedulable, while
SAG-Pr detects 81.5%.
Fig.
3 j shows that the average runtime increases as the number of cores rises. However, for preemptive systems,
SAG-Pr experiences a much steeper increase in runtime compared to our method. For instance, with 6 cores, our analysis takes an average of 1 second, while
SAG-Pr requires 354 seconds (≈ 6 minutes) to complete. When the number of cores is increased to 10, our analysis takes 280 seconds (≈ 5 minutes), whereas
SAG-Pr requires 12,052 seconds (≈ 200 minutes).
As in the previous experiment, the reduction in runtime is directly related to the number of states and edges, as shown in Fig.
3 k and Fig.
3 l. Overall across all data points, for non-preemptive systems, our method generates 66% fewer states than
SAG-NP, while for preemptive systems, it produces 7% more states compared to
SAG-Pr, because less merging occurs and because our method explores more states before SAG-Pr reaches its timeout. On the other hand, across all data points, our method has on average 37% fewer edges compared to
SAG-NP for non-preemptive systems and 7% fewer edges compared to
SAG-Pr for preemptive systems. It is worth mentioning that our method generates more states and edges with 12 cores compared to
SAG-Pr because, in 60% of task sets,
SAG-Pr times out before fully exploring all states and providing conclusive results.
Experiment 4 (impact of release jitter). We consider a system with 4 cores, 6 tasks per task set, 60% utilization, and vary the release jitter of the tasks.
Fig.
4 a shows that schedulability decreases as the release jitter increases. In this experiment, for non-preemptive systems, our method detects 16.875% of the task sets as schedulable, which is the same as
SAG-NP. For preemptive systems, our method identifies 76.625% of the task sets as schedulable, compared to 80% detected by
SAG-Pr.
As shown in Fig.
4 b, the runtime generally remains constant as the release jitter increases. However, our method consistently outperforms the original SAG in both non-preemptive and preemptive systems. Specifically, for non-preemptive systems, our method completes in 0.002 seconds, while
SAG-NP takes 0.06 seconds. For preemptive systems, our method finishes in 0.06 seconds, whereas
SAG-Pr requires 11.06 seconds. Fig.
4 c shows the average number of states generated by each method. Similar to the trend in previous experiments, our method generates fewer states compared to the original SAG across all data points.