
Minimizing redundant dependencies and interprocessor synchronizations

International Journal of Parallel Programming

Abstract

Run-time synchronization overhead is a crucial factor limiting speedup on parallel computers. In this paper, we present a new two-phase algorithm for removing redundant dependencies and minimizing interprocessor synchronizations when scheduling an acyclic task graph onto a multiprocessor system. The first phase removes redundant dependencies before scheduling; the second phase eliminates interprocessor synchronizations after scheduling. In a simulation using randomly generated task graphs, on average, 98.28% of the dependencies are eliminated in the first phase, and 65.86% of the remaining dependencies are eliminated in the second phase, for a total of 99.41% removed. The approach has also been applied to some benchmark task graphs. The two-phase algorithm, which has $O(n^3)$ time complexity and $O(n^2)$ space complexity, uses a new algorithm that computes the transitive closure and the transitive reduction at the same time, storing the results in a single matrix.
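The two phases can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details are not given in the abstract: it uses the textbook Warshall closure followed by a separate reduction pass rather than the single-matrix combined computation, and it models the second phase by adding per-processor execution-order edges and reducing again. The names `closure_and_reduction` and `needed_syncs`, the integer task numbering, and the list-of-lists schedule format are all assumptions made for illustration.

```python
def closure_and_reduction(n, edges):
    """Transitive closure and reduction of a DAG on tasks 0..n-1.

    Illustrative only: O(n^3) time and O(n^2) space, but the closure and
    the reduction are kept separately, unlike the paper's single matrix.
    """
    edges = set(edges)
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    # Warshall's algorithm: reach[i][j] becomes True if a path i -> j exists.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    # In a DAG, edge (u, v) is redundant exactly when some intermediate
    # task w lies on another path u -> ... -> w -> ... -> v.
    reduction = {
        (u, v)
        for (u, v) in edges
        if not any(reach[u][w] and reach[w][v] for w in range(n))
    }
    return reach, reduction


def needed_syncs(n, deps, schedule):
    """Cross-processor dependencies that still need a synchronization.

    `schedule` is a hypothetical format: one list of tasks per processor,
    in execution order, assumed consistent with `deps`.
    """
    proc_of = {}
    order_edges = set()
    for p, tasks in enumerate(schedule):
        for t in tasks:
            proc_of[t] = p
        # Consecutive tasks on one processor are ordered for free.
        order_edges.update(zip(tasks, tasks[1:]))
    _, reduced = closure_and_reduction(n, set(deps) | order_edges)
    return {(u, v) for (u, v) in reduced if proc_of[u] != proc_of[v]}


# Phase 1: (0, 3) is implied by 0 -> 2 -> 3 and is dropped.
deps = [(0, 2), (1, 2), (0, 3), (2, 3)]
_, reduced = closure_and_reduction(4, deps)

# Phase 2: with tasks 0, 1 on one processor and 2, 3 on another,
# (0, 2) is implied by the order 0 -> 1 plus the dependency 1 -> 2.
syncs = needed_syncs(4, reduced, [[0, 1], [2, 3]])
```

In this small example, `reduced` is `{(0, 2), (1, 2), (2, 3)}` and `syncs` is `{(1, 2)}`: of the four original dependencies, only one requires an interprocessor synchronization once the schedule is fixed.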




Cite this article

Chao, HY., Harper, M.P. Minimizing redundant dependencies and interprocessor synchronizations. Int J Parallel Prog 23, 245–262 (1995). https://doi.org/10.1007/BF02577868


  • DOI: https://doi.org/10.1007/BF02577868
