Abstract
Load balancing for parallel programs executed on distributed-memory computational systems is currently of great interest. In its most general form, the problem concerns a single parallel loop: the execution of a heterogeneous loop on a heterogeneous computational system. Stated in this way, the problem is NP-complete even for two nodes, and no acceptable heuristics for solving it have been found. Since developing such heuristics is a rather complicated task, we examined the problem by elementary methods in order to refine (and, possibly, simplify) the original problem statement. This paper discusses the results of that study. We obtain estimates of the efficiency of parallel loop execution as functions of the number of nodes for homogeneous and heterogeneous parallel computational systems. These estimates show that using heterogeneous parallel systems reduces efficiency even when their communication subsystems are scalable (see the definition in Section 4). Using local networks (heterogeneous parallel computational systems with nonscalable communication subsystems) for parallel computations with heavy data exchange is not advantageous and is feasible only for a small number of nodes (about five). An algorithm for the optimal distribution of data among the nodes of a homogeneous or heterogeneous computational system is proposed. Results of numerical experiments support these conclusions.
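To make the idea of distributing data among nodes of unequal speed concrete, here is a minimal sketch. It is not the algorithm from the paper, which the abstract does not describe. It assumes a homogeneous loop (every iteration costs the same), ignores communication cost entirely, and splits iterations in proportion to relative node speed. The function names `distribute_iterations` and `parallel_time` and the speed values are hypothetical and chosen only for illustration.

```python
def distribute_iterations(total_iterations, node_speeds):
    """Split loop iterations among nodes in proportion to their relative speeds.

    Assumption (not from the paper): all iterations cost the same and
    communication is free, so a speed-proportional split balances the load.
    """
    if total_iterations < 0 or not node_speeds or any(s <= 0 for s in node_speeds):
        raise ValueError("need a non-negative iteration count and positive speeds")

    total_speed = sum(node_speeds)
    # Ideal (fractional) share of each node.
    ideal = [total_iterations * s / total_speed for s in node_speeds]
    # Round down, then hand the leftover iterations to the nodes
    # with the largest fractional remainders.
    counts = [int(x) for x in ideal]
    leftover = total_iterations - sum(counts)
    by_remainder = sorted(
        range(len(node_speeds)),
        key=lambda i: ideal[i] - counts[i],
        reverse=True,
    )
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


def parallel_time(counts, node_speeds):
    """Time of the slowest node, in units of (iterations / speed)."""
    return max(c / s for c, s in zip(counts, node_speeds))


if __name__ == "__main__":
    speeds = [1, 1, 2]  # hypothetical relative node speeds
    counts = distribute_iterations(100, speeds)
    print(counts, parallel_time(counts, speeds))
```

Under these assumptions every node finishes at nearly the same time. With a heterogeneous loop, where iteration costs differ, a proportional split no longer suffices, which is where the problem discussed in the abstract becomes hard.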
Cite this article
Avetisyan, A.I., Gaisaryan, S.S. & Samovarov, O.I. Possibilities of Optimal Execution of Parallel Programs Containing Simple and Iterated Loops on Heterogeneous Parallel Computational Systems with Distributed Memory. Programming and Computer Software 28, 28–40 (2002). https://doi.org/10.1023/A:1013707600643
DOI: https://doi.org/10.1023/A:1013707600643