Minimizing redundant dependencies and interprocessor synchronizations

Chao, Heng-Yi; Harper, Mary P.

doi:10.1007/BF02577868


Published: June 1995

Volume 23, pages 245–262, (1995)

International Journal of Parallel Programming

Heng-Yi Chao¹ & Mary P. Harper¹


Abstract

Run-time synchronization overhead is a crucial factor in limiting speedup for parallel computers. In this paper, we present a new two-phase algorithm for removing redundant dependencies and minimizing interprocessor synchronizations when scheduling an acyclic task graph onto a multiprocessor system. The first phase removes redundant dependencies before scheduling; the second phase eliminates interprocessor synchronizations after scheduling. In a simulation using randomly generated task graphs, on the average, 98.28% of the dependencies are eliminated in the first phase, and 65.86% of the remaining dependencies are eliminated during the second phase, for a total of 99.41% removed. The approach has also been applied to some benchmark task graphs. The two-phase algorithm, which has O(n³) time complexity and O(n²) space complexity, utilizes a new algorithm which computes the transitive closure and reduction at the same time, storing results in a single matrix.
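As a rough illustration of the first phase only, the sketch below removes redundant dependencies from a small acyclic task graph by computing its transitive closure and then dropping every edge that is implied by a longer path. It is not the authors' algorithm: it uses the textbook Warshall closure followed by a separate reduction pass with two structures, whereas the paper computes closure and reduction together in a single matrix. It does stay within the same O(n³) time and O(n²) space bounds. The function names and the example graph are assumptions made for illustration. The helper `combined_removal` simply checks the arithmetic behind the reported totals: removing 98.28% and then 65.86% of what remains leaves 1.72% × 34.14% ≈ 0.59% of the dependencies.

```python
def transitive_closure(n, edges):
    """Warshall-style closure on nodes 0..n-1: O(n^3) time, O(n^2) space."""
    reach = [[False] * n for _ in range(n)]
    for u, v in edges:
        reach[u][v] = True
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def remove_redundant_dependencies(n, edges):
    """Keep only edges of an acyclic graph that no longer path implies.

    An edge (u, v) is redundant when some intermediate task w satisfies
    u ->* w and w ->* v. Assumes the input graph is acyclic.
    """
    reach = transitive_closure(n, edges)
    return [
        (u, v)
        for u, v in edges
        if not any(reach[u][w] and reach[w][v] for w in range(n))
    ]


def combined_removal(phase1, phase2):
    """Overall fraction removed when phase2 acts on what phase1 left."""
    return 1.0 - (1.0 - phase1) * (1.0 - phase2)


# Hypothetical 4-task graph: (0, 2) and (0, 3) are implied by 0 -> 1 -> 2 -> 3.
example_edges = [(0, 1), (1, 2), (0, 2), (2, 3), (0, 3)]
print(remove_redundant_dependencies(4, example_edges))  # [(0, 1), (1, 2), (2, 3)]
print(f"{combined_removal(0.9828, 0.6586):.2%}")        # 99.41%
```

The second phase, which eliminates interprocessor synchronizations after scheduling, depends on details of the schedule that the abstract does not give, so it is not sketched here.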



Author information

Authors and Affiliations

School of Electrical Engineering, Purdue University, 1285 Electrical Engineering Building, West Lafayette, Indiana 47907-1285
Heng-Yi Chao & Mary P. Harper



About this article

Cite this article

Chao, HY., Harper, M.P. Minimizing redundant dependencies and interprocessor synchronizations. Int J Parallel Prog 23, 245–262 (1995). https://doi.org/10.1007/BF02577868


Received: 31 May 1994
Issue Date: June 1995
DOI: https://doi.org/10.1007/BF02577868
