Discrete Optimization
Match-up scheduling of mixed-criticality jobs: Maximizing the probability of jobs execution

https://doi.org/10.1016/j.ejor.2017.03.054

Highlights

  • Definition of a new, non-regular criterion for the problem of mixed-criticality scheduling.

  • Complexity study and Mixed Integer Linear Programming formulation for the special case with fixed sequence of jobs.

  • Dynamic programming algorithm for a special case (fixed sequence, two criticality levels).

  • Branch and bound algorithm for the general problem and experiments.

Abstract

This paper deals with a mixed-criticality scheduling problem: each job has a criticality level depending on its importance. In addition, each job has a finite set of possible processing times, with a known probability for each of them. Every job must be processed between its release date and its deadline. Moreover, each job has a weight corresponding to its payoff. This problem has applications in single-machine scheduling for real-time embedded systems, production, and operating theaters.

We propose a model that takes all the possible processing times of a job into account. An offline multilevel schedule is computed such that safety rules are satisfied in every situation. This is achieved by allowing the rejection of low criticality jobs when higher criticality jobs need longer processing times at runtime. After such deviations, the runtime schedule is matched up again with the offline schedule. The offline multilevel schedule optimizes a non-regular criterion aiming to maximize the average weighted probability of jobs execution (i.e., the total expected payoff).

Such a problem is strongly NP-hard. We first study the problem where the sequence of jobs is fixed: we show its complexity and provide a MILP formulation. For the case with two levels of criticality, we provide a dynamic programming algorithm. Finally, we propose a Branch and Bound method for the general problem (i.e., without a fixed job sequence).

Introduction

In classical scheduling, processing times are often assumed to be deterministic and known in advance, which is not always true in reality. Determining the exact processing times of jobs is well known to be very difficult, notably due to the events occurring at runtime. Considering best case (i.e., shortest possible) processing times can lead to a schedule that is dense at runtime and that makes efficient usage of the resource; however, deadline constraints may be violated when the actual processing time is longer. On the other hand, considering worst case processing times allows the computation of a schedule that meets safety constraints, but it leads to a schedule that is sparse at runtime, i.e., to a waste of resources.

In mixed-criticality scheduling, first introduced by Vestal (2007), jobs with different criticality levels are distinguished. Considering different criticality levels ensures that the safety constraints of high criticality jobs are satisfied with very high probability, while enabling efficient resource usage via a schedule that avoids useless idle times and maximizes the (weighted) probability of jobs being executed within their time windows (defined by a release date and a deadline). The underlying idea is simple: high criticality jobs are not scheduled side by side; instead, they are interleaved with low criticality ones, which can be rejected at runtime when a high criticality job's processing time turns out to be longer. Therefore, a single multilevel schedule is computed offline, which includes several execution alternatives that are decided at runtime.

Our model is motivated by safety-critical applications, such as autonomous cars, using so-called time-triggered communication networks, where the nodes have synchronized clocks and messages are transmitted at moments defined by the offline schedule (see, for example, the static segment of the FlexRay protocol (Dvorak & Hanzalek, 2016) used in the automotive industry or the Isochronous Real-Time class of the Profinet protocol (Hanzalek, Burget, & Sucha, 2010) used in industrial automation). The schedule is repeated periodically, since the functionalities of the car (steering, trajectory planning, engine control loops, computer vision, chassis stabilization, navigation, entertainment, etc.) are periodic. For simplicity, we consider that all jobs have the same period; therefore we can omit the multi-periodic nature of the problem and concentrate on one period only (as in the Profinet case). Sensing, computation and actuation performed by the nodes are executed at specific moments within the period, which represent the release date (i.e., availability of the data on the transmitter side) and the deadline (i.e., latest moment when the data is needed on the receiver side) for every message transmitted on the network. Time-triggered communication is characterized by complete determinism, and is, hence, particularly easy to verify and certify. However, the traditional paradigm offers limited flexibility: once the schedule is computed (prior to runtime), it is not possible to modify it in response to events that may occur during runtime execution.

Instead, communications reliability may be increased by message retransmission if the original message was corrupted. The need for retransmission is rare, but it leads to a prolongation of the communication jobs at runtime. Jobs are nonpreemptive, since the particular structure of the messages does not allow resuming their sending after preemption. For this application, the criticality level of a job corresponds to its maximum number of possible (re)transmissions. For example, the Automotive Safety Integrity Level (ASIL) given by ISO 26262 defines four criticality levels, and the DO178-B avionics standard, used by the Federal Aviation Administration, defines five criticality levels. Let us consider the following jobs sharing one resource (i.e., the communication channel) and having three different levels of criticality:

  • Jobs with high criticality (three transmissions: criticality level 3) are used for safety-related functionalities, such as steering and braking;

  • Jobs with medium criticality (two transmissions: criticality level 2) are used for mission-related functionalities (their failure or malfunction may prevent a goal-directed activity from being successfully completed, for example an autonomous car will not reach a desired destination), such as combustion engine control or navigation system;

  • Jobs with low criticality (one transmission: criticality level 1) are used for infotainment functionalities, such as a CD player.

In this example the criticality levels are consecutive integers, a situation that does not necessarily occur. For instance, high criticality jobs could be allowed four transmissions: their criticality level would be 4 and there would be zero jobs with criticality level 3.

A solution of the scheduling problem is given by a three-level schedule:

  • Level 1 considers the best-case processing times (i.e., a single transmission of each message) of all jobs;

  • Level 2 omits low criticality jobs and considers two transmissions of the medium criticality and high criticality messages;

  • Level 3 includes high criticality jobs only, each of which represents a message transmitted three times.
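The three levels above can be illustrated with a small sketch (the job data, names, and the `level_view` helper below are illustrative assumptions, not taken from the paper): the level-k view keeps only the jobs with criticality at least k and assigns each of them k transmissions.

```python
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    criticality: int   # maximum number of transmissions
    base_time: int     # duration of a single transmission

def level_view(jobs, k):
    """Jobs present at level k: those with criticality >= k,
    each paired with its level-k processing time (k transmissions)."""
    return [(j.name, k * j.base_time) for j in jobs if j.criticality >= k]

jobs = [Job("steering", 3, 2), Job("navigation", 2, 3), Job("cd_player", 1, 1)]
print(level_view(jobs, 1))  # all jobs, one transmission each
print(level_view(jobs, 2))  # the low criticality job is omitted
print(level_view(jobs, 3))  # the high criticality job only
```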

Each job is constrained by its release date and deadline. Three different feasible mixed-criticality schedules with three jobs on one resource are shown in Fig. 1. When no retransmission occurs at runtime, the schedule is executed on level 1. The objective is to maximize the weighted probability of jobs execution (detailed in Section 2.2).
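A minimal sketch of the criterion, under the simplifying assumption that a low criticality job scheduled after a high criticality job is executed exactly when the realized processing time of the high criticality job finishes by the low criticality job's scheduled start (all function names and numbers below are illustrative, not the paper's formal model):

```python
def exec_probability(hi_start, hi_time_dist, lo_start):
    """Probability that the Lo-job starting at lo_start is executed,
    given the Hi-job's distribution {processing_time: probability}."""
    return sum(prob for p, prob in hi_time_dist.items()
               if hi_start + p <= lo_start)

def expected_payoff(weighted_jobs):
    """Total expected payoff: sum of weight * execution probability."""
    return sum(w * p for w, p in weighted_jobs)

# One, two or three transmissions of the Hi-job, with their probabilities.
dist = {2: 0.90, 4: 0.08, 6: 0.02}
print(exec_probability(0, dist, 4))  # Lo-job at t=4 survives with prob. ~0.98
print(exec_probability(0, dist, 6))  # pushing it to t=6 raises this to 1.0
```

Pushing the Lo-job later (i.e., introducing idle time before it) increases its execution probability, which is the intuition behind the non-regular "spread" criterion.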

Let us consider a production system where a single centralized machine controls all the processes of a workshop. An example of a high criticality job is one related to an important customer whose orders are crucial for the workshop's future: such jobs are subject to the customer's audit and cannot be subcontracted to an external company. A medium criticality job should be performed in-house, but it can be subcontracted. Low criticality jobs can be subcontracted without any impact on the workshop's reputation. A schedule needs to be robust with respect to the prolongation of processing times, subcontracting lower criticality jobs when higher criticality jobs need more resource time. The objective is to minimize the subcontracting expenses.

Let us consider a single operating theater which is dedicated to the provision of surgical operations under the uncertainty of their duration (Denton, Miller, Balasubramanian, & Huschka, 2010). A cardiovascular operation represents a high criticality job whose duration is not fully deterministic. Nevertheless, the cardiovascular operation needs to be completed even though it implies the rejection of some medium criticality job (such as a hip replacement surgery) or low criticality job (such as a plastic surgery operation). All jobs are constrained by release dates, representing the availability of medical checkups, and deadlines, representing their expiration time. A rejected job can be considered in some future schedules, but in such a case it requires a new medical checkup. The objective is to maximize the revenue of the operating theater.

In general, the purpose of the mixed-criticality scheduling framework is to manage interactions between higher and lower criticality jobs. The problem is to compute a nonpreemptive schedule S such that, in any case, high criticality jobs are processed by the machine at the corresponding dates. Other jobs are not necessarily processed, but if they are, they are also processed at their corresponding dates. Another specificity of mixed-criticality jobs is that they have several possible processing times. Indeed, high criticality jobs can extend their execution time if needed. If the processing time is not bounded, then we refer to such an instance as invalid.

This policy stems from Vestal's observation that "the more confidence one needs in a task execution time bound, the larger and more conservative that bound tends to be in practice"; he therefore proposed that a different processing time value be specified for each criticality level of a task, and we adopt this approach as well. Research on mixed-criticality scheduling is thus aimed at the construction of scheduling policies that, on the one hand, facilitate the certification process for high criticality jobs (see Baruah & Fohler, 2011 on certification-cognizant time-triggered scheduling) and, on the other hand, favor an efficient usage of resources. Following Vestal's work, the real-time scheduling community has done significant research on event-triggered scheduling for embedded systems (see the review by Burns and Davis (2014)).

In the majority of such existing works, jobs to be scheduled are preemptive, i.e., their execution can be interrupted, and resumed later. Moreover, most of the previous works do not consider the match-up with the original schedule. Their main concern is meeting certifications of high criticality jobs, which is achieved by simply stopping the execution of lower criticality jobs as soon as a higher criticality job prolongation occurs.

In order to fill this gap, Hanzálek, Tunys, and Šůcha (2016) proposed a model that allows a definition of feasible schedules, such that the feasibility of a schedule implies that it meets safety certifications. Therefore, such a space of solutions can be explored in order to maximize the (weighted) number of executed jobs, while satisfying the safety requirements. The model of Hanzálek et al. (2016) is defined for nonpreemptive jobs, and has applications in the scheduling of messages for embedded systems, as well as in production scheduling. The optimization criterion they chose is the classical scheduling criterion Cmax.

In the present work, we consider the same framework as Hanzálek et al. (2016), i.e., the same definition of feasible schedules. However, we consider a different criterion which, instead of favoring left-shifted schedules (i.e., where idle times are avoided whenever possible by starting jobs' execution as soon as possible), favors so-called "spread" schedules, i.e., schedules where idle times can be introduced (cf. Fig. 1). The advantage of such a criterion will be explained more precisely in Section 2.2. The general idea is that introducing more idle time allows more space to absorb unexpected events (such as emergency situations), in a similar way as in match-up scheduling (Akturk & Gorgulu, 1999; Bean, Birge, Mittenthal, & Noon, 1991).

The contributions of this paper are the following: the definition of a new optimization criterion for nonpreemptive mixed-criticality scheduling, a complexity study for such a problem as well as for special cases, a MILP formulation for a special case, exact algorithms (dynamic programming and Branch and Bound) for solving the problem, and experiments on the Branch and Bound algorithm. Contributions are more precisely summarized in Table 1 of Section 2.4.

In Section 2, we formally define the problem and we give its complexity. In Section 2.5, we provide a literature review. In Section 3, we discuss the case in which the sequence of jobs is fixed: we show that the problem is NP-hard, we provide a pseudopolynomial time algorithm for the case with two levels of criticality, and a MILP formulation for the problem with an arbitrary number of criticality levels. In Section 4, we provide a Branch and Bound method to solve the general problem. The experiments are described in Section 5. Finally, some conclusions are drawn in Section 6.

Section snippets

Problem definition

The scheduling problem definition consists of two parts: the schedule feasibility (Section 2.1) and the objective function (Section 2.2). The main notations defined here are also listed in Appendix A.

Problem with fixed order of jobs

In this section, we consider the case where the order of jobs is known. We first introduce the problem, then show its complexity and finally provide a dynamic programming algorithm. As usual for dynamic programming algorithms, the solution is built from the bottom up. The initial state (partial schedule) is the left-shifted schedule of the first Hi-job and all the preceding Lo-jobs. Starting from this initial schedule, we build all the possible different schedules of the Lo-jobs that are between

Branch and bound

In this section, we propose a Branch and Bound method in order to compute an optimal feasible solution. The Branch and Bound explores (partial) sequences of jobs (cf. Section 4.2). Upper bounds are obtained by computing the best possible execution probabilities of the jobs in the sequence, which are obtained by “pushing” the job to the right as much as possible in order to have the smallest possible coverage for it (cf. Section 4.4). Solutions are evaluated by MILP (cf. Section 4.3). Recall
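The "pushing to the right" idea behind the upper bound can be sketched as follows (a simplification under assumed names; the bound in the paper also accounts for the coverage between criticality levels):

```python
def latest_start(release, deadline, proc_time):
    """Latest feasible start of a job within its time window, or None
    if the window is too short; starting this late minimizes the
    overlap ("coverage") with a preceding job's prolongation."""
    s = deadline - proc_time
    return s if s >= release else None

print(latest_start(2, 10, 3))   # -> 7
print(latest_start(8, 10, 3))   # window too short -> None
```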

Experimental evaluation

The Branch and Bound algorithm was tested on randomly generated instances. Section 5.1 describes the instances generator, and Section 5.2 the numerical results.

Conclusion

Starting from the model for mixed-criticality jobs proposed by Hanzálek et al. (2016), we considered here a new, non-regular criterion, in order to maximize the jobs' probabilities of being executed at runtime. Such a problem is strongly NP-hard. For the special FS2L case with fixed sequence of jobs and two levels of criticality, we provided a pseudopolynomial time dynamic programming algorithm. We showed that FS2L is weakly NP-hard, even with common release dates and deadlines. Moreover FS2L with

Acknowledgments

The authors would like to thank Andrei Furtuna, student at Czech Technical University in Prague, for his implementation of the algorithms. This work was supported by the Ministry of Education of the Czech Republic under the project "Support for improving R & D teams and the development of intersectoral mobility at CTU in Prague" number CZ.1.07/2.3.00/30.0034. This work was supported by the US Department of the Navy Grant N62909-15-1-N094 issued by the Office of Naval Research Global. The United

References (31)

  • J.C. Bean et al., Matchup scheduling with multiple resources, release dates and disruptions, Operations Research (1991)

  • D. Bertsimas et al., Theory and applications of robust optimization, SIAM Review (2011)

  • E. Bini et al., Sensitivity analysis for fixed-priority real-time systems, Real-Time Systems (2007)

  • P. Bratley et al., Scheduling with earliest start and due date constraints, Naval Research Logistics Quarterly (1971)

  • A. Burns et al., Mixed criticality systems: A review, Technical report (2014)