Computing BLAS level-2 operations on workstation clusters using the divisible load paradigm*,*

https://doi.org/10.1016/j.mcm.2004.01.004
Under an Elsevier user license
open archive

Abstract

We consider the problem of executing large BLAS (basic linear algebra subprograms) Level-2 operations, such as matrix-vector products, in a network-based distributed computing environment composed of a bus-oriented workstation cluster. Unlike previous contributions, we take into account the fact that workstations, unlike mainframe computers, are not equipped with communication coprocessors or front-ends, which precludes any communication off-loading. Communication delays, which are significant in workstation clusters because of limited bandwidth, are explicitly accounted for; this aspect is ignored in most performance analyses of parallel computing systems. The main contribution of this study is to show that the optimal load partitioning, and the resulting performance of the network, depends critically on network bandwidth, computing capacity, and load characteristics. We design load distribution strategies for three cases (no communication, broadcast communication, and multicast communication) based on closed-form solutions of the optimal load partitioning problem, and we present an extensive asymptotic analysis with respect to several parameters of the load and the system. Necessary and sufficient conditions for feasible and optimal load sharing are also derived. Finally, a trade-off study between the optimal number of workstations and the bandwidth of the bus is presented.
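
As a rough illustration of the simplest of the three cases (no communication), the sketch below splits the rows of a matrix-vector product among workstations in inverse proportion to each one's time per row, so that all of them finish at about the same time. This is the usual equal-finish-time idea from divisible load theory, not the closed-form solutions derived in the paper, and it ignores bus delays entirely. The names partition_rows, matvec_partitioned, and unit_times are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def partition_rows(n_rows, unit_times):
    """Split n_rows among workers so that finish times are roughly equal.

    unit_times[i] is an assumed cost (time per matrix row) for worker i.
    Communication cost is ignored (the no-communication case only).
    """
    speeds = [Fraction(1) / Fraction(t) for t in unit_times]
    total_speed = sum(speeds)
    # Ideal, possibly fractional, share of rows for each worker.
    ideal = [n_rows * s / total_speed for s in speeds]
    rows = [int(share) for share in ideal]  # round down
    # Hand the leftover rows to the largest fractional remainders.
    leftover = n_rows - sum(rows)
    order = sorted(range(len(rows)),
                   key=lambda i: ideal[i] - rows[i], reverse=True)
    for i in order[:leftover]:
        rows[i] += 1
    return rows


def matvec_partitioned(A, x, unit_times):
    """Compute y = A x block by block, one row block per (simulated) worker."""
    counts = partition_rows(len(A), unit_times)
    y, start = [], 0
    for count in counts:
        block = A[start:start + count]  # rows assigned to this worker
        y.extend(sum(a * b for a, b in zip(row, x)) for row in block)
        start += count
    return y, counts
```

For instance, with three workstations whose assumed per-row times are 1, 2, and 3, partition_rows(12, [1, 2, 3]) assigns the most rows to the fastest machine and the fewest to the slowest. Once broadcast or multicast delays on the bus are included, the optimal split also depends on bandwidth, which is the case the paper analyses.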

Keywords

Divisible loads
BLAS Level-2 operations
Matrix-vector products
Optimal load distribution
Load partitioning

*

The work reported in the paper was in part supported by Brain Korea 21 Project, Kangwon National University, and Media Service Research Center (MSRC-ITRC) under the auspices of the Ministry of Information and Communication, Korea.

*

The authors would like to thank P. B. Sujit for generating the plots in the paper.