Abstract
In this paper we introduce the Divisible Load Scheduling (DLS) family of algorithms for data-intensive applications. These polynomial-time algorithms partition the input data and generate optimal mappings onto a collection of autonomous, heterogeneous computational systems. We prove the optimality of the solutions and report a simulation study of the algorithms.
Cite this article
Yu, C., Marinescu, D.C. Algorithms for Divisible Load Scheduling of Data-intensive Applications. J Grid Computing 8, 133–155 (2010). https://doi.org/10.1007/s10723-009-9129-0
DOI: https://doi.org/10.1007/s10723-009-9129-0