Abstract
A broad-brush tour of a platform-oblivious approach to scheduling dag-structured computations on platforms whose resources can change dynamically, both in availability and efficiency. The main focus is on the IC-scheduling and Area-oriented scheduling paradigms—the motivation, the dream, the implementation, and initial work on evaluation.
Keywords
- Area-oriented dag-scheduling
- Dynamically changing platforms
- IC-dag-scheduling
- Opportunistic dag-scheduling
1 Prehistory
Early this century, Fran Berman, then-director of the San Diego Supercomputing Center (SDSC), gave a distinguished lecture at my then-home institution, UMass-Amherst. During a subsequent one-on-one, Fran educated me about a Grid-consortium that SDSC participated in, jointly with several kindred centers. The consortium “contract” allowed any member institution to submit computing jobs to any other. There was a guarantee that submitted jobs would be completed—but not when. When I asked what kind of computations SDSC performed using this paradigm, I was shocked to learn that the computations had dependencies among subcomputations that constrained the order in which work could be done. (As I recall, these were wavefront-structured dependencies.) I asked Fran how her team coped with the possibility that work could grind to a halt pending the completion of jobs that had been deployed within the consortium but not yet completed. Fran responded that they used heuristics that seemed to work well—but that she did not know of any mathematical setting that would allow one to think about this situation rigorously. The challenge was irresistible!
2 The Dream of Opportunistic Scheduling
2.1 An Informal Overview
Many modern computing platforms—notably including clouds [26, 27], desktop grids [2], and volunteer-computing projects [11, 15]—exhibit extreme levels of dynamic heterogeneity. The availability and relative efficiencies of such platforms’ computing resources can change at unexpected times and in unexpected ways. Scheduling a computation for efficient execution on such a platform can be quite challenging, particularly when there are dependencies among the computation’s constituent chores¹ (jobs, tasks, etc.). We wanted to take up this challenge for the traditional scheduling setting of computations whose dependencies had the structure of dags (directed acyclic graphs).
The nodes of a computation-dag \(\mathcal G\) represent chores to be executed; \(\mathcal G\)’s arcs (directed edges) represent inter-chore dependencies that constrain the order in which chores can be executed. Specifically, a node v cannot be executed until all of its parents have been: these are the nodes that have arcs into v. Once all of v’s parents have been executed, v becomes eligible (for execution) and remains so until it is executed. \(\mathcal G\) has one or more sources—nodes that have no parents, hence are immediately eligible—and one or more sinks—nodes that have no “children.” Executing a non-sink may render new nodes eligible. The execution of \(\mathcal G\) terminates once all nodes have been executed.
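To make this execution model concrete, here is a minimal Python sketch (mine, not drawn from the cited sources): a dag is represented as an adjacency dict mapping every node to the list of its children, and the eligible set evolves as nodes are executed.

```python
# A dag is an adjacency dict mapping EVERY node to the list of its children
# (arcs point parent -> child).
def eligible(dag, done):
    """Nodes whose parents have all been executed but which are not yet executed."""
    parents = {v: {u for u in dag if v in dag[u]} for v in dag}
    return {v for v in dag if v not in done and parents[v] <= done}

# A 4-node "diamond": s is the unique source, t the unique sink.
diamond = {"s": ["a", "b"], "a": ["t"], "b": ["t"], "t": []}
done = set()
for v in ("s", "a", "b", "t"):           # one legal execution order
    assert v in eligible(diamond, done)  # v must be eligible when executed
    done.add(v)
    print(f"after {v}: eligible = {eligible(diamond, done)}")
```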
2.2 Opportunistic dag-Execution via Platform-Oblivious Scheduling
Recent studies have proposed seeking high performance and low cost within platforms that are dynamically heterogeneous and/or elastic by scheduling computations in a platform-oblivious manner. One compensates for ignoring platform details by carefully exploiting the detailed characteristics of one’s computation. The central thesis motivating this approach is that, particularly on the targeted platforms, one always benefits computationally with dag-structured workflows by keeping as many chores eligible as possible. Such scheduling enhances the likelihood of having work available as (advantageous) resources become available, hence of being able to exploit resources opportunistically. Platform-oblivious scheduling can be advantageous for the targeted platforms because it exploits unchanging, perfectly-known characteristics of one’s computation rather than attempting to adapt to characteristics of the platform, which are at best imperfectly known and, indeed, may change dynamically.
As we have pursued the dream of high-performing platform-oblivious schedules, we have found it technically advantageous to follow the lead of work-centric systems such as CHARM++ [14], by refining input dags before scheduling. We thereby can focus on scheduling fine-grained dags whose chores are all of (roughly) equal complexity. This focus extrapolates easily to dags that represent heterogeneous workloads: one simply models large chores as chains of “unit-size” ones with sequential dependencies, in the manner discussed in [9].
3 The Reality
3.1 Formalizing the Dream
A schedule \(\varSigma \) for a dag \(\mathcal G\) is a topological sort [10] of \(\mathcal G\), i.e., a linear ordering of \(\mathcal G\)’s nodes in which all parents of each node v lie to the left of v. The schedule prescribes the order in which \(\mathcal G\)’s nodes are selected for execution. For any schedule \(\varSigma \) for \(\mathcal G\) and any integer \(T \in [0..N_{\mathcal G}]\),² where \(N_{\mathcal G}\) denotes the number of \(\mathcal G\)’s nodes, \(E_\varSigma (T)\) denotes the number of nodes of \(\mathcal G\) that are eligible for execution at step T when \(\varSigma \) executes \(\mathcal G\).
A. ICO Quality and Optimality [21]. Our first quality measure for dag-schedules embodies the strictest possible interpretation of “eligible-node enhancement.” We measure the IC quality of an execution of \(\mathcal G\) by the number of nodes that are eligible after each node-execution—the more, the better. (Note that we measure time in an event-driven manner, as the number of nodes that have been executed to that point.) Our goal is to execute \(\mathcal G\)’s nodes in an order that maximizes the production rate of eligible nodes at every step of the execution, i.e., to craft a schedule \(\varSigma ^\star \) such that

$$ (\forall T \in [0..N_{\mathcal G}]): \quad E_{\varSigma ^\star }(T) \;=\; \max _{\varSigma }\, E_{\varSigma }(T). \qquad (1) $$
A schedule for \(\mathcal G\) that achieves this demanding goal is IC optimal (ICO, for short).
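To make these definitions concrete, the following sketch (an illustration, not an algorithm from [21]) computes a schedule’s eligibility profile \(E_\varSigma (0), \ldots , E_\varSigma (N_{\mathcal G})\) and verifies Eq. 1 by exhaustive search over all schedules; the search is exponential, so it is usable only on tiny dags.

```python
from itertools import permutations

def eligibility_profile(dag, order):
    """E(T) for T = 0..N as `order` executes `dag`; None if `order` is not a
    topological sort. `dag` maps every node to the list of its children."""
    parents = {v: {u for u in dag if v in dag[u]} for v in dag}
    done, profile = set(), []
    for step in range(len(order) + 1):
        elig = {v for v in dag if v not in done and parents[v] <= done}
        profile.append(len(elig))
        if step < len(order):
            if order[step] not in elig:
                return None  # a node was executed before all of its parents
            done.add(order[step])
    return profile

def is_ic_optimal(dag, schedule):
    """Brute-force test of Eq. 1: does `schedule` maximize E(T) at EVERY T?
    Exponential in the dag's size; meant only to make the definition concrete."""
    profiles = [p for p in (eligibility_profile(dag, list(o))
                            for o in permutations(dag)) if p is not None]
    per_step_maxima = [max(col) for col in zip(*profiles)]
    return eligibility_profile(dag, schedule) == per_step_maxima
```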
In Sect. 3.2.A, we discuss ICO schedules for many classes of significant “real” computations—surprisingly many, given the strictness of the condition in Eq. 1.
B. AREA Quality and Optimality [3]. As we detail in Sect. 3.2.A, the demands of Eq. 1 are so stringent that many dags do not admit ICO schedules. This led us to weaken the IC-scheduling paradigm in [3], by introducing the Area-oriented dag-scheduling paradigm.
Let \(\varSigma \) be a schedule for dag \(\mathcal G\). The Area of \(\varSigma \), denoted \(AREA(\varSigma )\), is the sum

$$ AREA(\varSigma ) \;=\; \sum _{T=0}^{N_{\mathcal G}} E_{\varSigma }(T). $$

Note that schedule \(\varSigma \)’s normalized Area—obtained by dividing \(AREA(\varSigma )\) by the number of nodes in \(\mathcal G\)—is the average number of nodes that are eligible as \(\varSigma \) executes \(\mathcal G\). (The term Area is by analogy with Riemann sums approximating integrals.) Our goal is to find, for each dag \(\mathcal G\), an Area-maximal schedule, i.e., a schedule \(\varSigma ^\star \) for \(\mathcal G\) such that

$$ AREA(\varSigma ^\star ) \;=\; \max _{\varSigma }\, AREA(\varSigma ). $$
A schedule for \(\mathcal G\) that achieves this goal is Area-optimal (A-O, for short).
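A companion to the ICO sketch above (again illustrative only) computes a schedule’s Area and finds an A-O schedule by exhaustive search over all topological sorts; as before, it is exponential and meant only for tiny dags.

```python
from itertools import permutations

def area(dag, order):
    """AREA(Σ): the sum of E_Σ(T) for T = 0..N; None if `order` is not a
    schedule. `dag` maps every node to the list of its children."""
    parents = {v: {u for u in dag if v in dag[u]} for v in dag}
    done, total = set(), 0
    for step in range(len(order) + 1):
        elig = {v for v in dag if v not in done and parents[v] <= done}
        total += len(elig)
        if step < len(order):
            if order[step] not in elig:
                return None  # not a topological sort
            done.add(order[step])
    return total

def area_optimal_schedule(dag):
    """Exhaustive search for an Area-maximal schedule (tiny dags only)."""
    legal = [list(o) for o in permutations(dag) if area(dag, o) is not None]
    return max(legal, key=lambda o: area(dag, o))
```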
Easily, every dag admits an A-O schedule. Importantly for our dream, the A-O scheduling paradigm is a strict extension of the ICO paradigm, in the following sense.
Theorem 1
([3]). If dag \(\mathcal G\) admits an ICO schedule \(\varSigma \), then every ICO schedule for \(\mathcal G\) is A-O, and vice versa.
C. Optimal Schedules via dag-duality. An important “meta-scheduling” contribution appears in [6] for ICO scheduling and in [3] for A-O scheduling. In both cases, one finds an algorithm that converts an ICO (resp., A-O) schedule for a dag \(\mathcal G\) into an ICO (resp., A-O) schedule for \(\mathcal G\)’s dual dag \(\widehat{\mathcal{G}}\). \(\widehat{\mathcal{G}}\) is obtained from \(\mathcal G\) by reversing all of \(\mathcal G\)’s arcs (e.g., the evolving mesh and reduction-mesh in Fig. 1(a) are dual to each other, as are the expansion-tree and reduction-tree in Fig. 1(b)).
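The dual construction itself is elementary; here is a sketch (the optimality-preserving conversions are the contributions of [3, 6] and are not reproduced here).

```python
def dual(dag):
    """The dual dag: same nodes, every arc reversed."""
    return {v: [u for u in dag if v in dag[u]] for v in dag}

def reverse_schedule(schedule):
    """Reversing a topological sort of G yields a topological sort of dual(G):
    v's children in G are exactly v's parents in the dual, and in the reversed
    order they all precede v. (This alone does not establish that optimality
    is preserved; that is the content of [3, 6].)"""
    return list(reversed(schedule))
```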
3.2 Finding High-Quality Schedules
A. Schedules with High ICO Quality. The stringent demands of IC-optimality—the maximum number of eligible nodes at every step of a dag-execution; cf. Eq. 1—raise the specter that ICO schedules exist only for a very constrained class of dags. Our first goal was to refute this possibility. We derived the following results.
(1) ICO schedules for specific families of dags and computations. In [6, 21, 22], we developed ICO scheduling strategies for many familiar classes of dags, including
- evolving meshes and reduction-meshes; see Fig. 1(a);
- expansion-trees and reduction-trees; see Fig. 1(b);
- butterfly-structured, convolutional dags; see Fig. 1(c, right).
In [5], we expanded the abstract, dag-oriented perspective of the preceding sources to develop ICO scheduling strategies for many familiar classes of computations, including
- convolutions—e.g., the Fast Fourier Transform, polynomial multiplication;
- expansion-reductions—e.g., numerical integration, comparator-based sorting;
- many “named” computations—e.g., the Discrete Laplace Transform, matrix multiplication.
(2) ICO schedules via dag decomposition. Careful analysis of our ad hoc schedules enabled us, in [19], to develop efficient—i.e., quadratic-time—algorithms that produce ICO schedules for a broad range of dags, based on structural decomposition. When the strategy succeeds in decomposing a dag \(\mathcal G\) in the prescribed manner, one can “read off” an ICO schedule for \(\mathcal G\) from the decomposition. The strategy has two major steps.
Step 1. Select a set of bipartite³ “building-block” dags that admit ICO schedules.
The chosen “building blocks” will be the atomic computations in the schedule. The sample repertoire in Fig. 2 satisfies both needs that are salient for our strategy. (a) The illustrated dags are reminiscent of pieces of the inter-chore dependency-dags for a broad range of significant computations. (b) These dags admit ICO schedules. Indeed, any schedule for these dags that executes all sources sequentially is an ICO schedule.
Step 2. Establish \(\rhd \)-priorities among the building-block dags.
For \(i = 1,2\), let dag \(\mathcal G_i\) admit an IC-optimal schedule \(\varSigma _i\). We say that \(\mathcal G_1\) has \(\rhd \)-priority over \(\mathcal G_2\)—denoted \(\mathcal G_1 \rhd \mathcal G_2\)—precisely when the following recipe produces an ICO schedule for executing both \(\mathcal G_1\) and \(\mathcal G_2\) (i.e., for executing the sum of \(\mathcal G_1\) and \(\mathcal G_2\)):
First: Execute \(\mathcal G_1\) by following schedule \(\varSigma _1\).
Then: Execute \(\mathcal G_2\) by following schedule \(\varSigma _2\).
One verifies that the relation \(\rhd \) is transitive and can be tested efficiently [7].
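As a crude stand-in for the efficient test of [7], one can compare the two concatenation orders on the sum of two dags, reusing eligibility_profile from the ICO sketch of Sect. 3.1.A. Since \(\mathcal G_1 \rhd \mathcal G_2\) demands IC optimality of the concatenated schedule, this comparison probes only a necessary condition.

```python
def disjoint_sum(g1, g2):
    """The sum of two dags, tagging nodes so the node sets stay disjoint."""
    return {**{(1, u): [(1, v) for v in ch] for u, ch in g1.items()},
            **{(2, u): [(2, v) for v in ch] for u, ch in g2.items()}}

def concat_profile(g1, s1, g2, s2):
    """Eligibility profile of 'Σ1 then Σ2' on the sum of g1 and g2.
    Reuses eligibility_profile from the ICO sketch above."""
    order = [(1, v) for v in s1] + [(2, v) for v in s2]
    return eligibility_profile(disjoint_sum(g1, g2), order)

# If g1 ▷ g2, then in particular executing Σ1 first is pointwise no worse:
# all(a >= b for a, b in zip(concat_profile(g1, s1, g2, s2),
#                            concat_profile(g2, s2, g1, s1)))
```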
The next ingredient in our strategy focuses on creating complex computation-dags by composing simple computation-dags. One composes dag \(\mathcal G_1\) with dag \(\mathcal G_2\) by merging/identifying some k sources of \(\mathcal G_2\) with some k sinks of \(\mathcal G_1\): the resulting dag is composite of type \(\mathcal G_1 \Uparrow \mathcal G_2\). (Easily, dag-composition composes the function specified by \(\mathcal G_1\) with the one specified by \(\mathcal G_2\).) The following sample composition illustrates dag-composition and its associativity.
(Figure omitted: a sample composition of two dags, illustrating dag-composition and its associativity.)
We can now announce the major contribution of our decomposition-based strategy.
Theorem 2
([19]). Focus on a dag \(\mathcal G\) that is composite of type \(\mathcal G_1 \Uparrow \mathcal G_2 \Uparrow \cdots \Uparrow \mathcal G_n\). Say that
— each dag \(\mathcal G_i\) admits the IC-optimal schedule \(\varSigma _i\);
— \(\mathcal G_1 \rhd \mathcal G_2 \rhd \cdots \rhd \mathcal G_n\).
Then, the following schedule for \(\mathcal G\) is IC optimal:
Use the schedules \(\{\varSigma _i\}\) to execute the dags seriatim, in order of \(\rhd \)-priority.
Efficient algorithms implement Theorem 2 on a large variety of “well-structured” dags. In particular, the two core processes in the theorem are computationally efficient:
— “parsing” dag \(\mathcal G\) into \(\mathcal G_1, \ldots , \mathcal G_n\) (when such a parsing exists)
— testing \(\rhd \)-priorities among the \(\mathcal G_i\).
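Once a parse and a priority ordering are in hand, Theorem 2’s schedule is a simple concatenation, as the following sketch shows. (The parsing and priority-testing algorithms of [19] and [7] are the substantive parts and are not shown; merged nodes—sinks of one component identified with sources of the next—are assumed to appear only in the earlier component’s schedule.)

```python
def theorem2_schedule(parsed_components):
    """Given [(G_1, Σ_1), ..., (G_n, Σ_n)] already ordered so that
    G_1 ▷ G_2 ▷ ... ▷ G_n, Theorem 2's ICO schedule for the composite
    dag is the concatenation of the component schedules."""
    return [v for _, sigma in parsed_components for v in sigma]
```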
Two clarifications will help illuminate Theorem 2.
1. A dag can have very nonlinear structure, even though it is composed from small dags that obey a linear chain of \(\rhd \)-priorities. Butterfly dags provide an example. Every butterfly dag \(\mathcal B\) is composed from many copies of the bipartite butterfly dag \(\mathcal B_2\): symbolically, \(\mathcal B\) is composite of type \(\mathcal B_2 \Uparrow \mathcal B_2 \Uparrow \cdots \Uparrow \mathcal B_2\) (see Fig. 1(c)). One verifies easily that \(\mathcal B_2\) has “self \(\rhd \)-priority”—i.e., \(\mathcal B_2 \rhd \mathcal B_2\)—so that \(\mathcal B\) admits a linear chain of \(\rhd \)-priorities: \(\mathcal B_2 \rhd \cdots \rhd \mathcal B_2\).
2. Many dags that admit ICO schedules are quite nonuniform in a graph-structure sense: the “well-structuredness” exploited in Theorem 2 is algebraic in nature, in terms of composition and \(\rhd \)-priority.
(3) A weakness in the IC-scheduling paradigm. Using Theorem 2 and ad hoc techniques, we developed ICO—i.e., optimal eligible-node-enhancing—schedules for many popular families of dags, including “butterflies,” “meshes,” and “trees.” But, with little difficulty, we also discovered “cousins” of these “well-structured” dags that do not admit any ICO schedule [19]. This deficiency in the IC-scheduling paradigm—the existence of dags that admit no optimal schedule—led us to seek “weakened” versions of the paradigm that would, for every input dag, algorithmically produce schedules that are optimal according to a quality metric that correlates with computational performance. We discovered two such paradigms.
1. A batched notion of ICO quality is introduced in [17]. The underlying idea is to execute a dag by choosing successive subsets of the then-eligible nodes.
2. An averaged notion of ICO quality underlies the Area quality metric of Sect. 3.1.B and [3]. The underlying quest is for schedules that maximize the average number of nodes that are eligible at each step of a dag-execution.
Both the batched-ICO and Area quality measures admit optimal schedules for every dag—but the general versions of both optimization problems are NP-complete [17, 20]. In the case of the Area measure, we were able to craft two readily computable associated heuristics (ao and sidney) that are (empirically) computationally beneficial—as discussed at length in Sect. 3.2.B. Regrettably, we have not yet succeeded in finding such an associated heuristic for the batched version of the IC-scheduling paradigm. We leave the attractive challenges related to the batched paradigm to the interested reader.
B. Schedules with High AREA Quality. In contrast to the IC-scheduling paradigm, our major accomplishments with Area-oriented scheduling involved heuristics inspired by the paradigm. We begin our discussion with theoretical developments.
(1) A-O schedulers for specific dag-families. In [3], we developed A-O schedulers for several classes of dags, including
- monotonic tree-dags: each dag is either an expansion-tree—a dag having one source, in which each nonsource has one parent—or the dual of an expansion-tree;
- expansion-reduction dags: each dag is obtained by composing a k-sink expansion-tree with a k-source reduction-tree. (Imagine, e.g., that we match up the sources of the righthand tree in Fig. 1(b) with the sinks/leaves of the lefthand tree.);
- compositions of bipartite cycle- and clique-dags. (The “building-block” dags of Fig. 2(bottom) exemplify the cycles and cliques; the butterfly-dag of Fig. 1(c) exemplifies the end product.)
Among the family-specific A-O schedulers that we developed, one stands out for its dag-scheduling consequences. This is the efficient algorithm developed in [8], which produces A-O schedules for series-parallel dags (SP-dags, for short). (SP-dags have a rich history in the design of logic circuits. More recently, they have been used to model multi-threaded parallel computations; cf. [1].) This algorithm decomposes an input SP-dag \(\mathcal G\) according to the following recursive recipe for generating SP-dags, and then it “reads off” an A-O schedule from the resulting “parse” of \(\mathcal G\).
A (2-terminal) series-parallel dag \(\mathcal G\) (SP-dag, for short) is produced by a sequence of the following operations.
1. Create. Form a dag \(\mathcal G\) that has:
   (a) two nodes, a source s and a target t, which are jointly \(\mathcal G\)’s terminals;
   (b) one arc, \((s \rightarrow t)\), directed from s to t.
2. Compose. Given SP-dags \(\mathcal G'\), with terminals \(s'\) and \(t'\), and \(\mathcal G''\), with terminals \(s''\) and \(t''\):
   (a) Parallel composition. Form the SP-dag \(\mathcal G= \mathcal G' \Uparrow \mathcal G''\) from \(\mathcal G'\) and \(\mathcal G''\) by merging \(s'\) with \(s''\) to form a new source s, and \(t'\) with \(t''\) to form a new target t.
   (b) Series composition. Form the SP-dag \(\mathcal G= (\mathcal G' \rightarrow \mathcal G'')\) from \(\mathcal G'\) and \(\mathcal G''\) by merging \(t'\) with \(s''\). \(\mathcal G\) has the single source \(s'\) and the single target \(t''\).
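The recipe translates directly into code. Below is a minimal sketch (mine, not the algorithm of [8]) of the Create/Compose operations, using the same adjacency-dict representation as the earlier sketches; fresh integer node names keep independently created dags disjoint.

```python
import itertools

_fresh = itertools.count()  # supply of fresh node names

def create():
    """Base case: source s, target t, one arc (s -> t)."""
    s, t = next(_fresh), next(_fresh)
    return {s: [t], t: []}, s, t

def parallel(g1, s1, t1, g2, s2, t2):
    """Parallel composition: merge the two sources and the two targets."""
    def ren(v):  # rename g2's terminals onto g1's
        return s1 if v == s2 else t1 if v == t2 else v
    g = {u: list(ch) for u, ch in g1.items()}
    for u, ch in g2.items():
        g.setdefault(ren(u), []).extend(ren(v) for v in ch)
    return g, s1, t1

def series(g1, s1, t1, g2, s2, t2):
    """Series composition: merge g1's target with g2's source."""
    def ren(v):
        return t1 if v == s2 else v
    g = {u: list(ch) for u, ch in g1.items()}
    for u, ch in g2.items():
        g.setdefault(ren(u), []).extend(ren(v) for v in ch)
    return g, s1, t2

# Example: two arcs composed in parallel, then an arc appended in series.
sp_dag, source, target = series(*parallel(*create(), *create()), *create())
```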
(2) The NP-completeness of AREA-maximization. After we developed the ICO schedules for specific dag-families discussed in Sect. 3.2.A(1), we were able to detect commonalities in reasoning that ultimately culminated in the proof of Theorem 2. In contrast, we found that our A-O schedules for the specific dag-families discussed in Sect. 3.2.B(1) relied in a fundamental way on the structures of the individual families. We were, therefore, not surprised to learn, in [20], that the general problem of computing A-O schedules is NP-complete. The proof in [20] reduces the 0–1 Minimum Weighted-Completion-Time Problem for bipartite dags, which is known to be NP-complete [25], to the A-O scheduling problem. This result shifted our focus entirely to the development of scheduling heuristics that (empirically) produce schedules with large Areas. We now describe the main heuristics that we have developed.
(3) Area-oriented scheduling heuristics. We have developed three scheduling heuristics that are “Area-centric,” in the sense that they exploit Area-related structural properties of the dag being scheduled.
(a) Heuristic d-g [3]. The dynamic-greedy scheduling heuristic d-g crafts a schedule for a dag \(\mathcal G\) by organizing \(\mathcal G\)’s eligible chores in a list structure that is (partially) ordered by the chores’ yields, with ties broken randomly. The yield of eligible chore v at step t is the number of non-eligible chores that would be rendered eligible if v were executed at that step. The yield of a chore u can change at each step, and the execution of u can change the yields of many other chores—specifically, those that share children with u. Thus, in contrast with our other schedulers, the schedules produced by d-g change at each step—which gives d-g time-complexity commensurate with that of our other heuristics.
Note. d-g’s successive choices of the next node to execute are locally optimal—modulo its randomized tie-breaking.
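As an illustration, here is a minimal sketch of the dynamic-greedy idea (my rendering, not the implementation evaluated in [3]); it recomputes yields at every step and breaks ties randomly.

```python
import random

def d_g(dag):
    """Dynamic-greedy schedule: at each step, execute an eligible node of
    maximum yield, i.e., one rendering the most new nodes eligible."""
    parents = {v: {u for u in dag if v in dag[u]} for v in dag}
    done, schedule = set(), []
    while len(done) < len(dag):
        elig = [v for v in dag if v not in done and parents[v] <= done]
        # Yield of v: children of v whose every OTHER parent is already done.
        yields = {v: sum(1 for w in dag[v] if parents[w] - {v} <= done)
                  for v in elig}
        best = max(yields.values())
        schedule.append(random.choice([v for v in elig if yields[v] == best]))
        done.add(schedule[-1])
    return schedule
```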
(b) Heuristic ao [8]. The Area-oriented scheduling heuristic ao builds on two facts: (i) We have access to an efficient A-O scheduler for SP-dags; cf. Section 3.2.B(1). (ii) Every dag \(\mathcal G\) can be transformed efficiently to an SP-dag \(\sigma (\mathcal G)\) that retains both \(\mathcal G\)’s inter-chore dependencies and (roughly) its degree of inherent parallelism. Several sources describe “SP-izing” transformations; a perspicuous version from [12] is invoked in [8]. Heuristic ao produces a high-Area schedule for a dag \(\mathcal G\) in three steps.
Step 1. Transform \(\mathcal G\) to an SP-dag \(\sigma (\mathcal G)\), using an algorithm from [12].
Step 2. Produce an A-O schedule \(\widetilde{\varSigma }\) for \(\sigma (\mathcal G)\), via the algorithm in [8].
Step 3. “Filter” schedule \(\widetilde{\varSigma }\) to remove the “auxiliary” nodes added when SP-izing \(\mathcal G\).
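In outline, the pipeline looks as follows; sp_ize and area_optimal_sp are hypothetical placeholder names for the SP-izing transformation of [12] and the A-O SP-dag scheduler of [8], which are too involved to reproduce here.

```python
def ao(dag):
    """Heuristic ao as a three-step pipeline. `sp_ize` and `area_optimal_sp`
    are hypothetical stand-ins for the algorithms of [12] and [8]."""
    sp_dag, auxiliary = sp_ize(dag)   # Step 1: SP-ize; also report added nodes
    sigma = area_optimal_sp(sp_dag)   # Step 2: A-O schedule for the SP-dag
    return [v for v in sigma if v not in auxiliary]  # Step 3: filter auxiliaries
```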
(c) Heuristic sidney. The sidney scheduling heuristic of [20] inherits both its name and its algorithmic underpinnings from a sophisticated dag-decomposition scheme from [23]. It schedules an input dag \(\mathcal G\) in four steps.
Step 1. Transform \(\mathcal G\) to its associated 0–1 version \(\mathcal G_{0,1}\).
The nodes of \(\mathcal G_{0,1}\) are obtained by splitting each node v of \(\mathcal G\) into two nodes, \(v_0\) and \(v_1\). Give each node of \(\mathcal G_{0,1}\) that has a 0 subscript (the 0-nodes) a processing time of 0 and a weight of 1; give each node that has a 1 subscript (the 1-nodes) a processing time of 1 and a weight of 0. Finally, give \(\mathcal G_{0,1}\) an arc \((u_1 \rightarrow v_0)\) for each arc \((u \rightarrow v)\) of \(\mathcal G\) and an arc \((u_0 \rightarrow u_1)\) for each node u of \(\mathcal G\).
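Step 1 is purely mechanical; the following sketch (my rendering of the construction just described) returns the transformed dag together with its processing-time and weight maps.

```python
def zero_one_version(dag):
    """Split each node v into v0 (time 0, weight 1) and v1 (time 1, weight 0);
    add arc (u0 -> u1) for every u and arc (u1 -> v0) for every arc (u -> v)."""
    g01, time, weight = {}, {}, {}
    for u in dag:
        g01[(u, 0)] = [(u, 1)]                   # the arc u0 -> u1
        g01[(u, 1)] = [(v, 0) for v in dag[u]]   # arcs u1 -> v0
        time[(u, 0)], weight[(u, 0)] = 0, 1      # 0-nodes: time 0, weight 1
        time[(u, 1)], weight[(u, 1)] = 1, 0      # 1-nodes: time 1, weight 0
    return g01, time, weight
```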
Step 2. Use a max-flow computation to perform a Sidney decomposition of \(\mathcal G_{0,1}\), via the algorithm in [23].
Step 3. Say that the decomposition of \(\mathcal G_{0,1}\) produces dags \(\mathcal G_1, \ldots , \mathcal G_k\).
a. Remove all 0-nodes from every \(\mathcal G_i\).
b. Use heuristic d-g to produce a schedule \(\varSigma _i\) for each \(\mathcal G_i\).
Step 4. Output schedule \(\varSigma = \varSigma _1 \varSigma _2 \cdots \varSigma _k\), the concatenation of the k subschedules.
At the cost of somewhat more computation than needed for heuristics d-g and ao, sidney empirically produces schedules whose Areas are within 85% of maximal [20].
3.3 The Benefits of Opportunistic Scheduling
A. Benefits Exposed via Simulation Experiments. Simulation-based studies of the opportunistic scheduling paradigms we have discussed appear in [3, 4, 13, 16, 24]. Rather than reproduce material that appears in great detail in those sources, I have decided to summarize here the major messages of those studies.
B1. One observes in all of the cited sources that there are two circumstances under which all (oblivious) scheduling paradigms are essentially equivalent in performance.
(a) When computing resources are plentiful, then the inherently sequential critical path of a dag is the only constraint on the speed of executing the dag.
(b) When computing resources are really meager, then there are no opportunities for efficiency-enhancing concurrency.
B2. One observes in [13, 16] situations where ICO schedules outperform a variety of platform-oblivious competitor-schedules by as much as 10–20%. The tested workloads in [16] were real scientific computations; the ones in [13] were synthetic, but with structures that approximated those of real scientific computations.
B3. Since the Area quality-metric is a weakening of IC-quality, it is not surprising that the benefits of A-O schedules observed in [3] are more modest than those observed in the IC-frameworks of [13, 16]. That said, one still often observes A-O schedules outperforming a variety of platform-oblivious competitor-schedules by double-digit percentages. Indeed, the same type of performance is observed even in [4] with heuristic ao. One observes that A-O schedules outperform those produced by heuristic ao, but only by percentages that may not justify the computational cost of producing the A-O schedule.
B4. The experiments in [4] suggest that the schedules produced by heuristic ao perform best when computing resources become available according to distributions that have low variances.
B5. The experimental settings in [3, 4] (involving, respectively, A-O schedules and schedules produced by heuristic ao) posit that computing resources become available according to low-variance distributions. It is observed experimentally in [4] that, within such settings, the Areas of generated schedules inversely track the makespans of the schedules’ dag-executions—i.e., larger Areas correlate with smaller makespans.
B6. In contrast to heuristic ao, the very high-Area schedules produced by heuristic sidney seem to favor situations wherein the distributions governing computing-resource availability do not have low variances [20]. This suggests that sidney’s schedules may be desirable in settings such as enterprise clouds, where the user can tailor the purchase of computing resources based on the varying numbers of eligible nodes produced over time by one’s dag-schedule.
B7. The experiments reported in [24] seem to validate Observations B4 and B6: schedules produced by heuristic ao are observed to perform very well in “single-instance” enterprise clouds, wherein there is a single block of computing resources that is available at any moment. In fact, the static heuristic ao is observed to compete well with dynamic competitor schedules.
B. Two Major Open Issues. We close with two open issues regarding opportunistic dag-scheduling. The benefits we have already uncovered—and enumerated in this section—explain our belief in the potential significance of success in addressing these issues.
Q1. The discovery in [19] that many dags do not admit ICO schedules led to three weakened versions of IC-scheduling: batched IC-scheduling [18], a version based on weakening the \(\rhd \)-priority relation of Theorem 2 [16], and Area-oriented scheduling [3]. Of these alternatives, only Area-oriented scheduling has been studied in any detail. The other alternatives certainly deserve more attention than they have received.
Q2. Our study of opportunistic dag-scheduling began with a focus on dynamically heterogeneous computing platforms—and it has largely retained that focus. The benefits of eligible-node-enhancing dag-schedules should be significant also in other domains:
(a) Opportunistic dag-schedulers may be valuable when pursuing cost-effective computing within an enterprise cloud. Having access to large numbers of eligible nodes should allow a user to maximally exploit available cost-effective resources. This benefit is hinted at in [24], but it deserves careful study.
(b) In a similar vein, opportunistic dag-schedulers may be beneficial in power-aware computing environments. Their schedules may enable one to maximally exploit low-power resources as they become available. This possibility, too, deserves careful study.
Notes
- 1.
We use the granularity-neutral “chore” for the units that form the computation.
- 2.
[a..b] denotes the set of integers \(\{ a, a+1, \ldots , b \}\).
- 3.
A bipartite dag’s nodes are partitioned into sets X and Y, with every arc going from X to Y.
References
Blumofe, R.D., Joerg, C.F., Kuszmaul, B.C., Leiserson, C.E., Randall, K.H., Zhou, Y.: Cilk: an efficient multithreaded runtime system. In: 5th ACM SIGPLAN Symposium on Principles and Practices of Parallel Programming (PPoPP 1995) (1995)
Casanova, H., Dufossé, F., Robert, Y., Vivien, F.: Scheduling parallel iterative applications on volatile resources. In: 25th IEEE International Parallel and Distributed Processing Symposium (2011)
Cordasco, G., De Chiara, R., Rosenberg, A.L.: On scheduling DAGs for volatile computing platforms: area-maximizing schedules. J. Parallel Distrib. Comput. 72(10), 1347–1360 (2012)
Cordasco, G., De Chiara, R., Rosenberg, A.L.: An AREA-oriented heuristic for scheduling DAGs on volatile computing platforms. IEEE Trans. Parallel Distrib. Syst. 26(8), 2164–2177 (2015)
Cordasco, G., Malewicz, G., Rosenberg, A.L.: Applying IC-scheduling theory to some familiar classes of computations. In: Workshop on Large-Scale, Volatile Desktop Grids (PCGrid 2007) (2007)
Cordasco, G., Malewicz, G., Rosenberg, A.L.: Advances in IC-scheduling theory: scheduling expansive and reductive DAGs and scheduling DAGs via duality. IEEE Trans. Parallel Distrib. Syst. 18, 1607–1617 (2007)
Cordasco, G., Malewicz, G., Rosenberg, A.L.: Extending IC-scheduling via the Sweep algorithm. J. Parallel Distrib. Comput. 70, 201–211 (2010)
Cordasco, G., Rosenberg, A.L.: On scheduling series-parallel DAGs to maximize AREA. Int. J. Found. Comput. Sci. 25(5), 597–621 (2014)
Cordasco, G., Rosenberg, A.L., Sims, M.: On clustering DAGs for task-hungry computing platforms. Cent. Eur. J. Comput. Sci. 1, 19–35 (2011)
Cormen, T.H., Leiserson, C.E., Rivest, R.L., Stein, C.: Introduction to Algorithms, 2nd edn. MIT Press, Cambridge (1999)
Estrada, T., Taufer, M., Reed, K.: Modeling job lifespan delays in volunteer computing projects. In: 9th IEEE International Symposium on Cluster, Cloud, and Grid Computing (CCGrid) (2009)
González-Escribano, A., van Gemund, A.J.C., Cardeñoso-Payo, V.: Mapping unstructured applications into nested parallelism. In: Palma, J.M.L.M., Sousa, A.A., Dongarra, J., Hernández, V. (eds.) VECPAR 2002. LNCS, vol. 2565, pp. 407–420. Springer, Heidelberg (2003)
Hall, R., Rosenberg, A.L., Venkataramani, A.: A comparison of DAG-scheduling strategies for internet-based computing. In: 21st IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2007)
Kale, L.V., Bhatele, A. (eds.): Parallel Science and Engineering Applications: The Charm++ Approach. Taylor & Francis/CRC Press, New York (2013)
Korpela, E., Werthimer, D., Anderson, D., Cobb, J., Lebofsky, M.: SETI@home: massively distributed computing for SETI. In: Dubois, P.F. (ed.) Computing in Science and Engineering. IEEE Computer Society Press (2000)
Malewicz, G., Foster, I., Rosenberg, A.L., Wilde, M.: A tool for prioritizing DAGMan jobs and its evaluation. J. Grid Comput. 5, 197–212 (2007)
Malewicz, G., Rosenberg, A.L.: Batch-scheduling dags for internet-based computing. In: Cunha, J.C., Medeiros, P.D. (eds.) Euro-Par 2005. LNCS, vol. 3648, pp. 262–271. Springer, Heidelberg (2005)
Malewicz, G., Rosenberg, A.L.: A pebble game for internet-based computing. In: Goldreich, O., Rosenberg, A.L., Selman, A.L. (eds.) Theoretical Computer Science. LNCS, vol. 3895, pp. 291–312. Springer, Heidelberg (2006)
Malewicz, G., Rosenberg, A.L., Yurkewych, M.: Toward a theory for scheduling DAGs in internet-based computing. IEEE Trans. Comput. 55, 757–768 (2006)
Roche, S.T., Rosenberg, A.L., Rajaraman, R.: On constructing DAG-schedules with large AREAs. Concurrency Comput. Pract. Experience 27(16), 4107–4121 (2015)
Rosenberg, A.L.: On scheduling mesh-structured computations for internet-based computing. IEEE Trans. Comput. 53, 1176–1186 (2004)
Rosenberg, A.L., Yurkewych, M.: Guidelines for scheduling some common computation-DAGs for internet-based computing. IEEE Trans. Comput. 54, 428–438 (2005)
Sidney, J.B.: Decomposition algorithms for single-machine sequencing with precedence relations and deferral costs. Oper. Res. 23(2), 283–298 (1975)
Taufer, M., Rosenberg, A.L.: Scheduling DAG-based workflows on single cloud instances: high performance and cost effectiveness with a static scheduler. Int. J. High Perform. Comput. Appl. (2015). doi:10.1177/1094342015594518
Woeginger, G.J.: On the approximability of average completion time scheduling under precedence constraints. Discrete Appl. Math. 131(1), 237–252 (2003)
Yao, S., Lee, H.-H.S.: Using mathematical modeling in provisioning a heterogeneous cloud computing environment. IEEE Comput. 44, 55–62 (2011)
Zaharia, M., Konwinski, A., Joseph, A.D., Katz, R., Stoica, I.: Improving MapReduce performance in heterogeneous environments. In: 7th USENIX Symposium on Operating System Design and Implementation (2008)
Acknowledgments
It is a pleasure to acknowledge the invaluable contributions of my collaborators on the work discussed here: Gennaro Cordasco, Rosario De Chiara, Ian Foster, Robert Hall, Greg Malewicz, Rajmohan Rajaraman, Scott Roche, Mark Sims, Michela Taufer, Arun Venkataramani, Mike Wilde, Matt Yurkewych. Our work on opportunistic dag-scheduling has been supported in part by several grants from the US National Science Foundation, most recently Grant CSR-1217981.