Abstract
In this paper, we present a scheduling scheme that estimates the turnaround time of parallel jobs on a heterogeneous, non-dedicated cluster or NoW (Network of Workstations). The scheme is based on an analytical prediction model that derives the processing and communication slowdown of each job's execution time from the computing power and occupancy of the cluster nodes and links. Preserving the responsiveness of local applications is also a goal.
We address the impact of inaccuracies in these estimates on overall system performance and demonstrate that job scheduling benefits from more accurate estimates. To validate the proposal, we measured the prediction deviations for parallel jobs in a real environment and compared them with those of the most representative methods in the literature.
The additional cost of obtaining the estimates was also evaluated and compared. This work is implemented within the CISNE project, a previously developed scheduling framework for non-dedicated and heterogeneous cluster environments.
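As a rough illustration of the idea of slowdown-based estimation, the following Python sketch scales a job's dedicated computation and communication times by slowdown factors derived from resource power and occupancy. It is not the model used in the paper: the formula `1 / (power * (1 - occupancy))`, the rule that the slowest node or link sets the pace, and all names (`Node`, `Link`, `slowdown`, `estimate_execution_time`, `estimate_turnaround`) are assumptions chosen only to make the concept concrete.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Node:
    relative_power: float  # hypothetical scale: 1.0 = reference node
    occupancy: float       # fraction of CPU used by local tasks, in [0, 1)


@dataclass
class Link:
    relative_bandwidth: float  # hypothetical scale: 1.0 = reference link
    occupancy: float           # fraction of bandwidth already in use, in [0, 1)


def slowdown(power: float, occupancy: float) -> float:
    """Assumed slowdown: inverse of the capacity left for the parallel job."""
    if power <= 0 or not 0 <= occupancy < 1:
        raise ValueError("power must be > 0 and occupancy in [0, 1)")
    return 1.0 / (power * (1.0 - occupancy))


def estimate_execution_time(cpu_time: float, comm_time: float,
                            nodes: Sequence[Node],
                            links: Sequence[Link]) -> float:
    """Scale dedicated CPU and communication times by the worst slowdown.

    Assumption: a tightly coupled parallel job advances at the pace of
    its slowest node and its slowest link.
    """
    proc_sd = max(slowdown(n.relative_power, n.occupancy) for n in nodes)
    comm_sd = max((slowdown(l.relative_bandwidth, l.occupancy) for l in links),
                  default=1.0)
    return cpu_time * proc_sd + comm_time * comm_sd


def estimate_turnaround(wait_time: float, cpu_time: float, comm_time: float,
                        nodes: Sequence[Node], links: Sequence[Link]) -> float:
    """Turnaround = time waiting in the queue + estimated execution time."""
    return wait_time + estimate_execution_time(cpu_time, comm_time, nodes, links)
```

Under these assumptions, a job with 100 s of computation and 20 s of communication, placed on an idle reference node and on a half-power node that is 50% occupied, with a link that is 50% occupied, would be slowed by factors of 4 and 2 respectively, giving an estimated execution time of 440 s.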
This work was supported by the MEyC-Spain under contract TIN2007-64974.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lérida, J.L., Solsona, F., Giné, F., García, J.R., Hanzich, M., Hernández, P. (2008). Enhancing Prediction on Non-dedicated Clusters. In: Luque, E., Margalef, T., Benítez, D. (eds) Euro-Par 2008 – Parallel Processing. Euro-Par 2008. Lecture Notes in Computer Science, vol 5168. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85451-7_26
DOI: https://doi.org/10.1007/978-3-540-85451-7_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85450-0
Online ISBN: 978-3-540-85451-7
eBook Packages: Computer Science, Computer Science (R0)