Capacity planning and scheduling for jobs with uncertainty in resource usage and duration

The Journal of Supercomputing

Abstract

Organizations around the world regularly schedule jobs (programs) to perform various tasks dictated by their end users. With the major movement toward cloud computing infrastructure, our organization follows a hybrid approach with both cloud and on-prem servers. The objective of this work is to perform capacity planning, i.e., estimating resource requirements, and job scheduling for on-prem grid computing environments. A key contribution of our approach is handling uncertainty in both the resource usage and the duration of jobs, a critical aspect in the finance industry, where stochastic market conditions significantly influence job characteristics. For capacity planning and scheduling, we simultaneously balance two conflicting objectives: (a) minimizing resource usage and (b) providing high quality of service to end users by completing jobs by their requested deadlines. We propose approximate approaches using deterministic estimators and pair sampling-based constraint programming. Our best approach (pair sampling-based) achieves an estimated reduction of up to 41.6% in peak resource usage compared to manual scheduling, without compromising quality of service.

Notes

  1. Refer to Sect. 3 for definitions of D and R.

  2. Unless stated otherwise, time is measured in seconds throughout the paper.

Acknowledgements

The authors would like to acknowledge Alberto Pozanco, Rui Silva, and Daniel Borrajo for their helpful suggestions and comments on this work. This paper was prepared for informational purposes in part by the Artificial Intelligence Research Group of JPMorgan Chase & Co and its affiliates ("J.P. Morgan") and is not a product of the Research Department of J.P. Morgan. J.P. Morgan makes no representation and warranty whatsoever and disclaims all liability, for the completeness, accuracy or reliability of the information contained herein. This document is not intended as investment research or investment advice, or a recommendation, offer or solicitation for the purchase or sale of any security, financial instrument, financial product or service, or to be used in any way for evaluating the merits of participating in any transaction, and shall not constitute a solicitation under any jurisdiction or to any person, if such solicitation under such jurisdiction or to such person would be unlawful.

Funding

The authors have no relevant financial or non-financial interests to disclose.

Author information

Corresponding author

Correspondence to Sunandita Patra.

Ethics declarations

Conflict of interest

The authors have no conflicts of interest to declare that are relevant to the content of this article. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix: Table of symbols

| Symbol | Description |
| --- | --- |
| COS | Capacity optimization and scheduling |
| COSPiS | Capacity optimization and scheduling via paired sampling |
| Det | Deterministic estimator-based constraint programming approach |
| MILP | Mixed integer linear programming |
| b | A job represented as a tuple (q, f, u, D, J, R). |
| q | The requested start time of job b. |
| f | Flexibility measure indicating the maximum delay allowed for the start of job b after its requested start time q. |
| u | The latest completion time (deadline) of job b. |
| D | A list of recorded durations or running times from historic data of job b's previous executions. |
| J | The set of jobs that job b depends on; job b can only start once all jobs in J have been completed. |
| R | The history of the number of CPU cores utilized by job b. |
| \(B_n\) | The set of n jobs, each represented as \(b_j\). |
| \(S_n\) | A schedule for n jobs, represented as \((s_1, s_2, \dots , s_n)\), where \(s_j\) is the scheduled start time of job \(b_j\). |
| T | The maximum timespan (makespan) within which all jobs need to run. |
| \(S^*_n\) | The optimal start-time schedule for \(B_n\) within a makespan of T. |
| \(\{s_j\}^n_{j=1}\) | A set of integer variables where \(s_j\) indicates the start time of job \(b_j \in B_n\). |
| p | An integer variable indicating the maximum (peak) number of CPU cores used across all jobs at any time \(t \in T\). |
| \({\hat{b}}_j\) | A job \(b_j\) mapped using a deterministic estimator function \(f^{est}\). |
| \({\hat{d}}_j\), \({\hat{r}}_j\) | Estimations of the duration and CPU usage of job \(b_j\), respectively. |
| \({\textbf{X}}\), \({\textbf{Y}}\) | Sets of job runtime intervals and resource usages for n jobs, used in the cumulative constraint. |
| \(f^{est}\) | Estimator function |
| K | Hyperparameter of COSPiS (number of pair samples) |
| \(\alpha\) | Hyperparameter of COSPiS (tolerance of job deadline violations) |
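
To make the notation above concrete, the following minimal Python sketch encodes a job as the tuple (q, f, u, D, J, R), applies a deterministic estimator, and evaluates the peak p of a candidate schedule. It is an illustration under stated assumptions, not the paper's implementation: the names `Job`, `max_estimator`, `peak_usage`, `is_feasible`, and `sample_pairs` are hypothetical, the worst-case (maximum) estimator is only one possible choice of \(f^{est}\), and `sample_pairs` reflects one plausible reading of "paired sampling" in which the i-th entries of D and R are assumed to come from the same historical run. The toy job data is invented for the example.

```python
import random
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Job:
    """A job b = (q, f, u, D, J, R), following the table of symbols."""
    q: int          # requested start time (seconds)
    f: int          # flexibility: maximum allowed delay of the start after q
    u: int          # latest completion time (deadline)
    D: List[int]    # historic durations
    J: List[int]    # indices of jobs that must finish before this job starts
    R: List[int]    # historic CPU-core counts (assumed index-aligned with D)


def max_estimator(job: Job) -> Tuple[int, int]:
    """Hypothetical f^est: worst-case duration and core count from history."""
    return max(job.D), max(job.R)


def peak_usage(starts: Sequence[int], durations: Sequence[int],
               cores: Sequence[int]) -> int:
    """Peak p of total cores in use, treating job j as running on [s_j, s_j + d_j)."""
    events = []
    for s, d, r in zip(starts, durations, cores):
        events.append((s, r))        # cores claimed at the start
        events.append((s + d, -r))   # cores released at the end
    events.sort()  # at equal times, releases (negative) sort before claims
    running = peak = 0
    for _, delta in events:
        running += delta
        peak = max(peak, running)
    return peak


def is_feasible(jobs: Sequence[Job], starts: Sequence[int],
                durations: Sequence[int]) -> bool:
    """Check start windows, deadlines, and dependencies for a schedule."""
    for j, job in enumerate(jobs):
        if not job.q <= starts[j] <= job.q + job.f:
            return False
        if starts[j] + durations[j] > job.u:
            return False
        if any(starts[i] + durations[i] > starts[j] for i in job.J):
            return False
    return True


def sample_pairs(job: Job, k: int, rng: random.Random) -> List[Tuple[int, int]]:
    """Draw k (duration, cores) pairs, keeping each duration with its own core count."""
    picks = [rng.randrange(len(job.D)) for _ in range(k)]
    return [(job.D[i], job.R[i]) for i in picks]


# Invented toy instance: job 1 depends on job 0; job 2 is independent.
jobs = [
    Job(q=0, f=10, u=30, D=[8, 10, 9], J=[], R=[4, 6, 5]),
    Job(q=0, f=20, u=40, D=[5, 7, 6], J=[0], R=[3, 2, 4]),
    Job(q=0, f=15, u=40, D=[6, 5], J=[], R=[5, 3]),
]
estimates = [max_estimator(b) for b in jobs]
durations = [d for d, _ in estimates]
cores = [r for _, r in estimates]

eager = [0, 10, 0]      # start job 2 as early as requested
shifted = [0, 10, 10]   # delay job 2 within its flexibility window
print(peak_usage(eager, durations, cores), peak_usage(shifted, durations, cores))
```

In this toy instance both schedules are feasible, and delaying the third job within its flexibility window lowers the estimated peak from 11 to 9 cores. A constraint programming solver would search over the start times \(s_j\) to minimize p rather than compare two hand-picked schedules.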

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Patra, S., Pathan, M., Mahfouz, M. et al. Capacity planning and scheduling for jobs with uncertainty in resource usage and duration. J Supercomput 80, 22428–22461 (2024). https://doi.org/10.1007/s11227-024-06282-8

  • DOI: https://doi.org/10.1007/s11227-024-06282-8
