Abstract
As hardware devices grow more heterogeneous, cluster computing must handle heterogeneous resources in order to make full use of the system. This paper introduces a new job allocation strategy for multi-cluster systems in diskless environments. With Ganglia as the resource monitor and Condor as the queuing system, a heterogeneous multi-cluster system is constructed, with and without storage devices, to evaluate system performance. The proposed algorithm, called the Well-Balanced Allocation Strategy (WBAS), has the scheduler dispatch MPI-based jobs to appropriate resources across multiple clusters. The strategy dispatches each job to nodes with similar performance, thereby equalizing execution times across all the nodes the job requires. WBAS is implemented on the constructed heterogeneous multi-cluster system to evaluate the performance of the scheduling strategy. The experimental results show that the proposed strategy performs well and can efficiently improve system performance.
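The idea of dispatching a job to nodes with similar performance can be illustrated with a minimal sketch. This is not the WBAS algorithm itself, whose details the abstract does not give; it only assumes that each free node carries a single numeric performance score (for example, one derived from monitoring data) and picks the group of nodes whose scores are closest together. The `Node` type, the `score` field, the tie-breaking rule, and the function name `pick_similar_nodes` are all hypothetical.

```python
from typing import NamedTuple


class Node(NamedTuple):
    name: str      # hypothetical host name
    cluster: str   # hypothetical cluster label
    score: float   # hypothetical performance score (higher means faster)


def pick_similar_nodes(nodes: list[Node], k: int) -> list[Node]:
    """Return the k nodes whose performance scores span the smallest range."""
    if k <= 0 or k > len(nodes):
        raise ValueError("k must be between 1 and the number of free nodes")
    # After sorting by score, the tightest group of k nodes is always a
    # contiguous window, so a single pass over the windows is enough.
    ranked = sorted(nodes, key=lambda n: n.score)
    best_start = 0
    best_key = None
    for start in range(len(ranked) - k + 1):
        window = ranked[start:start + k]
        spread = window[-1].score - window[0].score
        # Smaller spread wins; ties go to the faster group (an assumption).
        key = (spread, -sum(n.score for n in window))
        if best_key is None or key < best_key:
            best_key, best_start = key, start
    return ranked[best_start:best_start + k]


# Five free nodes spread over three clusters; the job needs three of them.
free_nodes = [
    Node("a1", "A", 3.0), Node("a2", "A", 2.9),
    Node("b1", "B", 1.2), Node("b2", "B", 3.1),
    Node("c1", "C", 2.0),
]
chosen = pick_similar_nodes(free_nodes, 3)
print([n.name for n in chosen])  # prints ['a2', 'a1', 'b2']
```

In this example the selected nodes come from two different clusters, because closeness in score, not cluster membership, drives the choice. A slow node such as `b1` is left out even though it is free, since an MPI job typically finishes only when its slowest process does.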
Cite this article
Yang, CT., Lai, KC. & Tung, HY. On construction of a well-balanced allocation strategy for heterogeneous multi-cluster computing environments. J Supercomput 56, 270–299 (2011). https://doi.org/10.1007/s11227-009-0369-3
DOI: https://doi.org/10.1007/s11227-009-0369-3