Requirements for optimal execution of loops with tests

Author:
A. Uht

Univ. of California, La Jolla, CA

ICS '88: Proceedings of the 2nd international conference on Supercomputing, June 1988, Pages 230–237, https://doi.org/10.1145/55364.55387

Published: 01 June 1988

ABSTRACT

Both the efficient execution of branch-intensive code and knowing the bounds on the same are important issues in computing in general and supercomputing in particular. Prior work has suggested, implied, or left open as a possible maximum that the hardware needed to execute code with branches optimally (i.e., with oracular performance) is exponentially dependent on the total number of dynamic branches to be executed, where this number of branches is at least proportional to the number of iterations of the loop. For classes of code taking at least one cycle per iteration to execute, this is not the case. For loops containing one test (normally in the form of a Boolean recurrence of order 1), it is shown that the hardware necessary varies from exponential to polynomial in the length of the dependency cycle L, while execution time varies from one time cycle per iteration to less than L time cycles per iteration; the variation depends on specific code dependencies.
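
As a rough illustration of the kind of loop the abstract discusses, the sketch below runs a loop with a single test whose outcome depends on a Boolean carried over from the previous iteration (an order-1 Boolean recurrence). It is not taken from the paper: the function names, the loop body, and the threshold parameter are all hypothetical, and `speculative_paths` only shows why a naive "follow every outcome" scheme might be assumed to grow exponentially with the number of unresolved dynamic branches.

```python
def run_loop_with_test(values, threshold):
    """Sequentially execute a loop containing one test per iteration.

    Hypothetical example: the test reads `flag`, which was written by the
    previous iteration, so the branch outcomes form an order-1 recurrence.
    """
    flag = False          # Boolean carried from one iteration to the next
    dynamic_branches = 0  # one test is evaluated in every iteration
    trace = []
    for v in values:
        dynamic_branches += 1
        # The test depends on the previous iteration's flag and on new data.
        if flag or v > threshold:
            flag = not flag
        trace.append(flag)
    return trace, dynamic_branches


def speculative_paths(unresolved_branches):
    """Count outcome combinations if every unresolved branch is followed both ways.

    Assumption for illustration only: each branch has two outcomes and no
    outcome is pruned, which gives 2**n combinations for n branches.
    """
    return 2 ** unresolved_branches


trace, branches = run_loop_with_test([1, 5, 2, 7], threshold=3)
print(trace, branches, speculative_paths(branches))
```

The number of dynamic branches equals the number of iterations here, which is the proportionality the abstract refers to; the abstract's claim is that exponential hardware growth in that number is not actually required for code taking at least one cycle per iteration.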


Published in
ICS '88: Proceedings of the 2nd international conference on Supercomputing
June 1988
679 pages
ISBN: 0897912721
DOI: 10.1145/55364
Editor: J. Lenfant, Rennes
Copyright © 1988 ACM
Publisher
Association for Computing Machinery
New York, NY, United States