
Possibilities of Optimal Execution of Parallel Programs Containing Simple and Iterated Loops on Heterogeneous Parallel Computational Systems with Distributed Memory

Programming and Computer Software

Abstract

Load balancing for parallel programs executed on computational systems with distributed memory is currently of great interest. In its most general form, the problem concerns a single parallel loop: executing a heterogeneous loop on a heterogeneous computational system. Stated this way, the problem is NP-complete even for two nodes, and no acceptable heuristics for solving it have been found. Since developing heuristics is a rather complicated task, we decided to examine the problem by elementary methods in order to refine (and, possibly, simplify) the original problem statement. This paper discusses the results of that study. We obtain estimates of the efficiency of parallel loop execution as functions of the number of nodes in homogeneous and heterogeneous parallel computational systems. These estimates show that using heterogeneous parallel systems reduces efficiency even when their communication subsystems are scalable (see the definition in Section 4). Local networks (heterogeneous parallel computational systems with nonscalable communication subsystems) are not advantageous for parallel computations with heavy data exchange and are usable only with a small number of nodes (about five). We suggest an algorithm for the optimal distribution of data among the nodes of a homogeneous or heterogeneous computational system. Results of numerical experiments substantiate these conclusions.
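The abstract does not describe the suggested distribution algorithm, so the following is only a minimal illustrative sketch of the general idea of balancing a loop across nodes of unequal speed, not the authors' method. It assumes every iteration costs the same, ignores communication entirely, and uses hypothetical names (`distribute_iterations`, `parallel_time`) and hypothetical relative node speeds.

```python
from fractions import Fraction
from math import floor


def distribute_iterations(total, speeds):
    """Split `total` equal-cost loop iterations among nodes in proportion
    to their relative speeds (hypothetical inputs, for illustration only)."""
    if total < 0 or not speeds or any(s <= 0 for s in speeds):
        raise ValueError("total must be >= 0 and all speeds must be positive")
    exact_speeds = [Fraction(s) for s in speeds]
    speed_sum = sum(exact_speeds)
    # Ideal (possibly fractional) share of each node, computed exactly.
    ideal = [total * s / speed_sum for s in exact_speeds]
    shares = [floor(x) for x in ideal]
    leftover = total - sum(shares)
    # Give the remaining iterations to the nodes with the largest
    # fractional remainders (largest-remainder rounding).
    order = sorted(range(len(speeds)),
                   key=lambda i: ideal[i] - shares[i],
                   reverse=True)
    for i in order[:leftover]:
        shares[i] += 1
    return shares


def parallel_time(shares, speeds):
    """Time until the slowest-finishing node completes, assuming a node of
    speed s needs 1/s time units per iteration and no communication cost."""
    return max(n / s for n, s in zip(shares, speeds))


if __name__ == "__main__":
    speeds = [1, 1, 2]  # assumed relative speeds of three nodes
    balanced = distribute_iterations(100, speeds)
    print(balanced, parallel_time(balanced, speeds))
    print([34, 33, 33], parallel_time([34, 33, 33], speeds))
```

Under these assumptions, the speed-proportional split finishes sooner than an equal split because no fast node sits idle waiting for a slow one. Modeling communication cost, which the abstract identifies as decisive for nonscalable networks, would require additional terms that this sketch deliberately omits.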




About this article

Cite this article

Avetisyan, A.I., Gaisaryan, S.S. & Samovarov, O.I. Possibilities of Optimal Execution of Parallel Programs Containing Simple and Iterated Loops on Heterogeneous Parallel Computational Systems with Distributed Memory. Programming and Computer Software 28, 28–40 (2002). https://doi.org/10.1023/A:1013707600643


  • DOI: https://doi.org/10.1023/A:1013707600643
