On/Off-Line Prediction Applied to Job Scheduling on Non-Dedicated NOWs

  • Regular Paper
Journal of Computer Science and Technology

Abstract

This paper proposes a prediction engine for non-dedicated clusters that estimates the turnaround time of parallel applications, even in the presence of the serial workload of the workstation owner. The prediction engine can be configured to work with three different estimation kernels: a Historical kernel, a Simulation kernel based on analytical models, and a Hybrid kernel that integrates both. These estimation methods were integrated into a scheduling system, named CISNE, which can be executed in on-line or off-line mode. The accuracy of the proposed estimation methods was evaluated under different job scheduling policies in both a real and a simulated cluster environment. In both environments, the Hybrid kernel gave the best results because it combines the ability of a simulation engine to capture the dynamism of a non-dedicated environment with the accuracy of historical methods in estimating application runtime given the state of the resources.

Author information

Corresponding author

Correspondence to Mauricio Hanzich.

Additional information

This work was supported by the MEyC under Grant No. TIN 2008-05913.

About this article

Cite this article

Hanzich, M., Hernández, P., Giné, F. et al. On/Off-Line Prediction Applied to Job Scheduling on Non-Dedicated NOWs. J. Comput. Sci. Technol. 26, 99–116 (2011). https://doi.org/10.1007/s11390-011-9418-5

  • DOI: https://doi.org/10.1007/s11390-011-9418-5
