Speed scaling problems with memory/cache consideration

Journal of Scheduling

Abstract

Speed scaling problems consider energy-efficient job scheduling on processors that can adjust their speed to reduce energy consumption, where power consumption is a convex function of speed (usually \(P(s) =s^{\alpha }, \alpha =2,3\)). In this work, we study speed scaling problems that take memory/cache into account. Each job needs some time for its memory operation when it is fetched from memory, and needs less time if fetched from the cache. The objective is to minimize energy consumption while satisfying the time constraints of the jobs. Two models are investigated: the non-cache model and the with-cache model. The non-cache model is a variant of the ideal model, where each job i needs a fixed \(c_i\) time for its memory operation; the with-cache model further considers the cache, a memory device with much faster access time but limited space. The uniform with-cache model is a special case of the with-cache model in which all \(c_i\) values are the same. We provide an \(O(n^3)\) time algorithm and an improved \(O(n^2\log n)\) time algorithm to compute the optimal solution in the non-cache model. For the with-cache model, we prove that it is NP-complete to compute the optimal solution. For the uniform with-cache model with agreeable jobs (later-released jobs do not have earlier deadlines), we derive an \(O(n^4)\) time algorithm to compute the optimal schedule, while for the general case we propose a \((2\alpha \frac{g}{g-1})^{\alpha }/2\)-approximation algorithm in a resource augmentation setting in which the memory operation time can be accelerated by a factor of at most g.
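As a quick illustration of the energy model above (a sketch with illustrative numbers, not taken from the paper): running a workload w at constant speed s takes w/s time and consumes \(E = s^{\alpha }\cdot w/s = w\cdot s^{\alpha -1}\), so slowing down, when deadlines permit, saves energy polynomially.

```python
# Energy of running workload w at constant speed s under P(s) = s**alpha.
# alpha and w are illustrative values, not taken from the paper.
alpha = 3
w = 8.0

def energy(s):
    """Energy = power * duration = s**alpha * (w / s) = w * s**(alpha - 1)."""
    return s**alpha * (w / s)

# Halving the speed divides the energy by 2**(alpha - 1) = 4 here.
assert energy(2.0) == 4 * energy(1.0)
```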

References

  • Albers, S. (2010). Energy-efficient algorithms. Communications of the ACM, 53(5), 86–96.

  • Albers, S., & Antoniadis, A. (2014). Race to idle: New algorithms for speed scaling with a sleep state. ACM Transactions on Algorithms (TALG), 10(2), 9.

  • Albers, S., Antoniadis, A., & Greiner, G. (2015). On multi-processor speed scaling with migration. Journal of Computer and System Sciences, 81(7), 1194–1209.

  • Antoniadis, A., Huang, C. C., & Ott, S. (2015). A fully polynomial-time approximation scheme for speed scaling with sleep state. In Proceedings of the twenty-sixth annual ACM-SIAM symposium on discrete algorithms (pp. 1102–1113).

  • Aydin, H., Devadas, V., & Zhu, D. (2006). System-level energy management for periodic real-time tasks. In Proceedings of the 27th IEEE real-time systems symposium (pp. 313–322).

  • Bambagini, M., Marinoni, M., Aydin, H., & Buttazzo, G. (2016). Energy-aware scheduling for real-time systems: A survey. ACM Transactions on Embedded Computing Systems (TECS), 15(1), 7.

  • Bansal, N., Bunde, D. P., Chan, H. L., & Pruhs, K. (2008). Average rate speed scaling. In Proceedings of the 8th Latin American theoretical informatics symposium, volume 4957 of LNCS (pp. 240–251).

  • Bansal, N., Kimbrel, T., & Pruhs, K. (2004). Dynamic speed scaling to manage energy and temperature. In Proceedings of the 45th annual symposium on foundations of computer science (pp. 520–529).

  • Baptiste, P. (1999). An \(O(n^4)\) algorithm for preemptive scheduling of a single machine to minimize the number of late jobs. Operations Research Letters, 24(4), 175–180.

  • Bini, E., Buttazzo, G., & Lipari, G. (2005). Speed modulation in energy-aware real-time systems. In IEEE proceedings of the 17th Euromicro conference on real-time systems (pp. 3–10).

  • Choi, K., Soma, R., & Pedram, M. (2005). Fine-grained dynamic voltage and frequency scaling for precise energy and performance trade-off based on the ratio of off-chip access to on-chip computation times. IEEE Transactions on Computer Aided Design of Integrated Circuits and Systems, 24(1), 18–28.

  • Hong, I., Qu, G., Potkonjak, M., & Srivastavas, M. B. (1998). Synthesis techniques for low-power hard real-time systems on variable voltage processors. In Proceedings of the IEEE real-time systems symposium (pp. 178–187).

  • Hsu, C. H., & Feng, W. C. (2004). Effective dynamic voltage scaling through CPU-boundedness detection. In the 4th IEEE/ACM workshop on power-aware computing systems (pp. 135–149).

  • Irani, S., Shukla, S., & Gupta, R. K. (2007). Algorithms for power savings. ACM Transactions on Algorithms, 3(4), 41.

  • Ishihara, T., & Yasuura, H. (1998). Voltage scheduling problem for dynamically variable voltage processors. In Proceedings of the 1998 international symposium on low power electronics and design. IEEE.

  • Li, M., & Yao, F. (2005). An efficient algorithm for computing optimal discrete voltage schedules. SIAM Journal on Computing, 35(3), 658–671.

  • Seth, K., Anantaraman, A., Mueller, F., & Rotenberg, E. (2003). Fast: Frequency-aware static timing analysis. In Proceedings of the 24th IEEE real-time system symposium (pp. 40–51).

  • Wu, W., Li, M., & Chen, E. (2009). Min-energy scheduling for aligned jobs in accelerate model. In Proceedings of 20th international symposium on algorithms and computation (ISAAC 09) (pp. 462–472).

  • Yang, C. Y., Chen, J. J., & Kuo, T. W. (2007). Preemption control for energy-efficient task scheduling in systems with a DVS processor and non-DVS devices. In Proceedings of the 13th IEEE international conference on embedded and real-time computing systems and applications (pp. 293–300).

  • Yao, F., Demers, A., & Shenker, S. (1995). A scheduling model for reduced CPU energy. In Proceedings of IEEE symposium on foundations of computer science (FOCS) (pp. 374–382).

Acknowledgements

This work is supported in part by the National Natural Science Foundation of China under Grants No. 61727809, No. 61572342, and No. 61672154; by the Research Grants Council of the Hong Kong Special Administrative Region, China (Project No. CityU11268616); and by the Natural Science Foundation of Jiangsu Province under Grant No. BK20151240.

Corresponding author

Correspondence to Weiwei Wu.

Appendices

Appendix A: Proof of Lemma 1

Proof

Property (P1) can be proved by standard swapping techniques. For example, for two jobs with deadlines \(d_{j_1}\le d_{j_2}\), suppose on the contrary that in a schedule these two jobs finish at times \(t_{j_2}\), \(t_{j_1}\) with \(t_{j_2}< t_{j_1}\). Clearly, \(t_{j_2}<t_{j_1}\le d_{j_1}\le d_{j_2}\) by the feasibility of the schedule. We can then obtain a new schedule by swapping the execution order of \(j_1,j_2\) while keeping the speed function unchanged. The new schedule makes \(j_1\) finish earlier than \(j_2\) and \(j_2\) finish at time \(t_{j_1}\) with \(t_{j_1}\le d_{j_2}\). Thus, the deadline constraints of these two jobs are not violated. Moreover, the energy consumption stays the same, since the speed function is not changed. By repeatedly applying this swap, any feasible schedule can be converted into EDF order.

We now prove (P2) by contradiction. Suppose on the contrary that job i is executed with more than one speed in the optimal schedule. We choose two small intervals (with the same length \(\epsilon \rightarrow 0\)) that are used to execute job i with speeds \(s_1>s_2>0\). We could move workload \(\frac{s_1-s_2}{2}\epsilon \) of i from the interval with speed \(s_1\) to the interval with speed \(s_2\) so that the two intervals will have the same speed \(\frac{s_1+s_2}{2}\). By the convexity of the power function, this decreases the energy consumption and does not violate the feasibility of the schedule, which leads to a contradiction. Therefore, every job has a unique speed in the optimal solution. Accordingly, the optimal solution is composed of blocks.
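The averaging step in (P2) can be checked numerically; the following sketch uses illustrative speeds and \(\alpha \), not values from the paper:

```python
# Convexity argument of (P2): averaging the speeds of two equal-length
# intervals preserves the workload but strictly decreases the energy.
alpha = 3
eps = 1e-3                      # common length of the two small intervals
s1, s2 = 5.0, 2.0               # two distinct speeds, s1 > s2 > 0

energy_before = (s1**alpha + s2**alpha) * eps
s_avg = (s1 + s2) / 2           # both intervals now run at the average speed
energy_after = 2 * s_avg**alpha * eps

assert abs((s1 + s2) * eps - 2 * s_avg * eps) < 1e-12  # workload unchanged
assert energy_after < energy_before                    # energy strictly drops
```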

Now we focus on property (P3). Assume that OPT has speed \(s_1\) in [a, b] which is the maximum speed in OPT. Let the job that is executed (or virtually executed) at the end of block [a, b] be \(j_1\). Assuming on the contrary that b is not a tight deadline, we have \(d_{j_1}>b\). We will select a small interval \([a_0,a_0+\epsilon ']\) with length \(\epsilon '\rightarrow 0\) as defined below. If the end of block [a, b] is executing a virtual workload, then we set \([a_0,a_0+\epsilon ']\) to be the latest interval that executes \(j_1\)’s workload (not virtual) with speed \(s_1\) (this small interval exists because \(j_1\) has virtual speed \(s_1\)). Otherwise, we set \(a_0+\epsilon '=b\).

We discuss two cases for the interval \([b,d_{j_1}]\). If OPT has a small interval \([a',b']\) with \(b\le a'<b'\le d_{j_1}\) that executes some workload with speed \(s_2<s_1\), then moving a partial workload of \(j_1\) in \([a_0,a_0+\epsilon ']\) to \([a',a'+\epsilon ']\) (resulting in the same speed in these two blocks) reduces the energy by Fact 1. The resulting schedule is feasible, which contradicts the optimality of OPT. Consider the remaining case, in which the whole interval \([b,d_{j_1}]\) is used for eviction or the jobs executed in \([b,d_{j_1}]\) have speed \(s_1\). Then OPT must allocate an eviction interval [b, v] with virtual speed \(s_2<s_1\). Such [b, v] exists, because otherwise the peak block would be [a, v] instead of [a, b]. Suppose that OPT allocates job \(j_2\)'s eviction time at interval \([v-\epsilon ,v]\) with \(\epsilon <\epsilon '\). By the definition of virtual speed, \(j_2\)'s workload is executed at speed \(s_2\) in some interval, which must lie outside \([a,d_{j_1}]\), since we have shown that no other jobs are executed in \([a,d_{j_1}]\) at a speed lower than \(s_1\). W.l.o.g., assume that interval \([a',b']\) with \(d_{j_1}\le a'<b'\le d_{j_2}\) is used to execute \(j_2\)'s workload. We now show that a transformed schedule with less energy consumption can be obtained, contradicting optimality. The transformation changes the two chosen intervals with speeds \(s_1,s_2\) into three intervals with speeds \(s_3,s_3,s_4\) satisfying \(s_1>s_3>s_4>s_2\), so that the energy consumption is reduced by the convexity of the power function. The details are as follows. We focus on one virtual interval \([v-\epsilon , v]\) and two intervals \([a_0,a_0+\epsilon ']\) and \([a',b']\). We then apply the following transformation procedure:

  1. Move all workloads in \([a',a'+\epsilon ]\) to \([a'+\epsilon ,b']\).

  2. Swap job \(j_2\)'s eviction interval \([v-\epsilon ,v]\) with the vacant interval \([a',a'+\epsilon ]\).

  3. Move partial workload of \([a_0, a_0+\epsilon ']\) to the vacant interval \([v-\epsilon ,v]\) so that these two blocks have the same speed (let the speed be \(s_3\)). Ensure that the speed \(s_3\) in \([a_0,a_0+\epsilon '],[v-\epsilon ,v]\) is larger than the speed (\(s_4\)) in \([a'+\epsilon ,b']\) by selecting a small \(\epsilon \).

After the transformation, we obtain one virtual interval \([a',a'+\epsilon ]\) and the following three intervals: \([a_0,a_0+\epsilon ']\) with speed \(s_3\), \([v-\epsilon , v]\) with speed \(s_3\), and \([a'+\epsilon , b']\) with speed \(s_4\). The total energy in intervals \([a_0,a_0+\epsilon ']\) and \([a',b']\) before the transformation is \(s_1^{\alpha }\cdot \epsilon ' +s_2^{\alpha }(b'-a')\), while the total energy in these intervals after the transformation is \(s_3^{\alpha }\cdot (\epsilon '+\epsilon ) +s_4^{\alpha }(b'-a'-\epsilon )\). Obviously, the resulting schedule is still feasible for all jobs after the transformation. By the property \(s_1>s_3>s_4>s_2\) and the convexity of the power function, this chain-like transformation reduces the energy. Thus, all cases lead to a contradiction, and b is a tight deadline. Symmetrically, a is a tight arrival time. This proves property (P3). \(\square \)
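The energy comparison at the heart of this chain-like transformation can be verified numerically; all interval lengths and speeds below are illustrative (the proof only requires \(s_1>s_3>s_4>s_2\) and workload conservation):

```python
# Chain transformation of (P3): replace an interval of length eps_p at peak
# speed s1 and an interval [a', b'] at speed s2 by three intervals at speeds
# s3, s3, s4. The workload is conserved and the energy strictly decreases.
alpha = 3
eps_p = 0.001                   # epsilon': length of [a0, a0 + eps']
eps = 0.0005                    # epsilon:  length of the eviction interval
L = 0.01                        # length of [a', b']
s1, s2, s3 = 6.0, 1.0, 4.0

work = s1 * eps_p + s2 * L                    # workload before the transformation
s4 = (work - s3 * (eps_p + eps)) / (L - eps)  # speed on [a'+eps, b'] conserving it
assert s1 > s3 > s4 > s2                      # the ordering required by the proof

energy_before = s1**alpha * eps_p + s2**alpha * L
energy_after = s3**alpha * (eps_p + eps) + s4**alpha * (L - eps)
assert energy_after < energy_before           # the transformation saves energy
```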

Appendix B: Algorithm of optimal discrete voltage schedule in Section 3.2

[Algorithm 5: pseudocode figure not reproduced]

Theorem 9

Algorithm 5 maintains the following properties at the beginning of iteration i:

  1. \(Committed(i+1) \subseteq I^{s_1}_{i+1} \cup I^{s_2}_{i+1}\)

  2. \(\bigcup ^n_{k=i+1} I^{s_2}_k \subseteq Committed \subseteq (\bigcup ^n_{k=i+1} I^{s_1}_k) \cup (\bigcup ^n_{k=i+1} I^{s_2}_k)\)

  3. \(Committed \cap (\bigcup ^i_{k=1} I^{s_1}_k) = \emptyset \)

  4. \(Committed \cap I^{s_1}_i = \emptyset \).

Proof

In iteration i, time intervals assigned to job \(J_i\) include all time intervals of \(I^{s_2}_i\) and some time intervals (explained in more detail later) from \(I^{s_1}_i\); therefore, (1) and (2) are correct. For (3), first, \((\bigcup ^i_{k=1} I^{s_1}_k)\) is disjoint with \((\bigcup ^n_{k=i+1} I^{s_1}_k)\), and \((\bigcup ^i_{k=1} I^{s_2}_k)\) is disjoint with \((\bigcup ^n_{k=i+1} I^{s_2}_k)\). Second, \((\bigcup ^i_{k=1} I^{s_1}_k)\) is contained in \((\bigcup ^i_{k=1} I^{s_2}_k)\) (which can be proved by induction; one may refer to Li and Yao (2005) for additional detail), which results in \(\bigcup ^i_{k=1} I^{s_1}_k \cap ( \bigcup ^n_{k=i+1} I^{s_1}_k \cup \bigcup ^n_{k=i+1} I^{s_2}_k) = \emptyset \). Combined with (2), (3) is correct. (4) is obvious from (3). \(\square \)

Correctness of Algorithm 5 The high-level idea is to assign job \(J_i\) the remaining intervals from its \(s_2\)-schedule first. If the allocated time is sufficient to finish \(J_i\) at speeds \(s_1\) and \(s_2\), then we move on to the next iteration. Otherwise (the time is not sufficient even if we run the job at the high speed \(s_1\) constantly), we extract some time from \(J_i\)’s \(s_1\)-schedule.

In iteration i, time intervals assigned to job \(J_i\) are from \((I^{s_2}_i \cup I^{s_1}_i) - Committed\). We first explain Step 2 by discussing two extreme cases. In the first case, we take \(Committed(i) = I\) and use constant speed \(s_2\) for job \(J_i\). This schedule may not be feasible, but it contains no idle time. In the second case, we take \(Committed(i) = I \cup I^{s_1}_i\) and use constant speed \(s_1\) for job \(J_i\). This schedule must be feasible for job \(J_i\) by property (4) of Theorem 9. Combining these two cases, an \((s_1,s_2)\)-schedule must exist. Once Committed(i) is fixed, an \((s_1,s_2)\)-schedule can be obtained by solving the linear equations in Theorem 4 for job \(J_i\). We set \(Committed(i) = I\) if \(s_1 \cdot (|Committed(i)|-c_i) \ge w_i\); otherwise we set \(Committed(i) = I \cup I^{'}\) such that \(s_1 \cdot (|Committed(i)|-c_i) = w_i\), where \(I^{'} \subseteq I^{s_1}_i\) is taken from the right side of \(I^{s_1}_i\). Consequently, no idle time interval is produced by the algorithm, and we obtain a feasible \((s_1,s_2)\)-schedule for job \(J_i\).
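The Committed(i) decision in Step 2 can be sketched as follows; the function below is a hypothetical simplification that works only with total interval lengths (the actual algorithm manipulates the interval sets themselves):

```python
# Sketch of Step 2 of Algorithm 5, in terms of total interval lengths only.
# s1: high speed; w_i, c_i: workload and memory-operation time of job J_i;
# avail: total length of I = (I_i^{s2} union I_i^{s1}) - Committed;
# s1_pool: total length of I_i^{s1} time that can still be extracted.
def committed_length(s1, w_i, c_i, avail, s1_pool):
    """Return (total committed length, extra time I' taken from I_i^{s1})."""
    if s1 * (avail - c_i) >= w_i:
        return avail, 0.0               # I alone suffices even at top speed s1
    extra = w_i / s1 + c_i - avail      # smallest I' with s1*(avail+extra-c_i) = w_i
    assert extra <= s1_pool, "infeasible: not enough s1-schedule time"
    return avail + extra, extra

# Example: w_i = 4, c_i = 1, s1 = 2; avail = 2 is short, so 1 extra unit is taken.
assert committed_length(2.0, 4.0, 1.0, 2.0, 5.0) == (3.0, 1.0)
```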

Cite this article

Wu, W., Li, M., Wang, K. et al. Speed scaling problems with memory/cache consideration. J Sched 21, 633–646 (2018). https://doi.org/10.1007/s10951-018-0565-1
