
Conservative Scheduling: Using Predicted Variance to Improve Scheduling Decisions in Dynamic Environments

Published: 15 November 2003

ABSTRACT

In heterogeneous and dynamic environments, efficient execution of parallel computations can require mappings of tasks to processors whose performance is both irregular (because of heterogeneity) and time-varying (because of dynamicity). While adaptive domain decomposition techniques have been used to address heterogeneous resource capabilities, temporal variations in those capabilities have seldom been considered. We propose a conservative scheduling policy that uses information about expected future variance in resource capabilities to produce more efficient data mapping decisions. We first present techniques, based on time series predictors that we developed in previous work, for predicting CPU load at some future time point, average CPU load for some future time interval, and variation of CPU load over some future time interval. We then present a family of stochastic scheduling algorithms that exploit such predictions of future availability and variability when making data mapping decisions. Finally, we describe experiments in which we apply our techniques to an astrophysics application. The results of these experiments demonstrate that conservative scheduling can produce execution times that are both significantly faster and less variable than other techniques.
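To make the data-mapping idea concrete, the sketch below shows one plausible reading of a conservative mapping rule: each host receives a share of the data in proportion to its predicted effective capacity, discounted by the predicted variability of its CPU load. This is an illustrative assumption, not the paper's actual algorithm; the names (HostForecast, conservative_shares), the availability model, and the tuning factor k are all hypothetical.

```python
# A minimal sketch, assuming a simple load-to-availability model, of a
# "conservative" data-mapping heuristic in the spirit of the abstract:
# work is assigned in proportion to each host's predicted capacity,
# penalized by the predicted variation of its CPU load.

from dataclasses import dataclass


@dataclass
class HostForecast:
    name: str
    speed: float                # relative CPU speed (e.g., benchmarked rate)
    predicted_mean_load: float  # predicted average CPU load over the interval
    predicted_load_std: float   # predicted variation (std. dev.) of CPU load


def conservative_shares(hosts, total_units, k=1.0):
    """Split `total_units` of data across hosts in proportion to a
    conservative estimate of each host's capacity over the interval.

    The conservative estimate assumes the load will sit `k` standard
    deviations above its predicted mean; k = 0 reduces to scheduling on
    the mean prediction alone.
    """
    capacities = []
    for h in hosts:
        pessimistic_load = h.predicted_mean_load + k * h.predicted_load_std
        # Rough availability model (an assumption): a task gets roughly
        # 1 / (1 + load) of the CPU on a host with the given load.
        avail = max(0.0, 1.0 / (1.0 + pessimistic_load))
        capacities.append(h.speed * avail)

    total_capacity = sum(capacities)
    return {h.name: total_units * c / total_capacity
            for h, c in zip(hosts, capacities)}


if __name__ == "__main__":
    hosts = [
        HostForecast("fast-but-bursty", 2.0, 0.5, 0.8),
        HostForecast("slow-but-steady", 1.0, 0.2, 0.05),
    ]
    # With k = 1 the bursty host receives a smaller share than a purely
    # mean-based mapping would give it.
    print(conservative_shares(hosts, total_units=1000, k=1.0))
```

Larger values of k shift work away from hosts whose load is predicted to fluctuate, trading some peak performance for lower variability in completion time, which is the behaviour the abstract attributes to conservative scheduling.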


Published in

SC '03: Proceedings of the 2003 ACM/IEEE Conference on Supercomputing
November 2003
859 pages
ISBN: 1581136951
DOI: 10.1145/1048935
Copyright © 2003 ACM


Publisher

Association for Computing Machinery, New York, NY, United States


    Acceptance Rates

SC '03 Paper Acceptance Rate: 60 of 207 submissions, 29%. Overall Acceptance Rate: 1,516 of 6,373 submissions, 24%.
