Skip to main content
Log in

Toward auto-scheduling compilers

  • Published:
The Journal of Supercomputing Aims and scope Submit manuscript

Abstract

In this paper we propose a general framework for compiling, scheduling, and executing parallel programs on parallel computers. We discuss important aspects of program partitioning, scheduling, and execution, and consider possible realistic alternatives for each issue. Subsequently we propose a possible implementation of an auto-scheduling compiler and give simple examples to illustrate the principles. Our approach to the entire problem is to utilize program information available to the compiler while, at the same time, allowing for run-time “corrections” and flexibility.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Similar content being viewed by others

References

  • Aho, A. V., Sethi, R., and Ullman, J. D. 1986. Compilers: Principles, Techniques, and Tools. Addison Wesley, Reading, Mass.

    Google Scholar 

  • Allen, F. E. and Cocke, J. 1972. A catalogue of optimizing transformations. In Design and Optimization of Compilers (R. Rustin, ed.), Prentice-Hall, Englewood Cliffs, N.J., pp. 1–30.

    Google Scholar 

  • Allen, J. R., and Kennedy, K. 1982. PFC: A program to convert Fortran to parallel form. Tech. Rept. MASC-TR82–6, Rice University, Houston, Texas.

    Google Scholar 

  • Allen R., and Kennedy, K. 1987. Automatic translation of FORTRAN programs to vector form. ACM Transactions on Programming Languages and Systems, 9(4).

  • Alliant Computer Systems Corp. 1985. FX/Series Architecture Manual. Acton, Mass.

    Google Scholar 

  • American National Standards Institute. 1981. American National Standard for Information Systems. Programming Language Fortran S8 (X3.9–198x). Revision of X3.9–1978, Draft S8, Version 99, ANSI, New York.

    Google Scholar 

  • Banerjee, U. 1979. Speedup of ordinary programs. Ph.D. Thesis, University of Illinois at Urbana-Champaign, DCS Report No. UIUCDCS-R-79–989.

    Google Scholar 

  • Bokhari, S. 1988. Partitioning problems in parallel, pipelined and distributed computing. IEEE Transactions on Computers, 37(1).

  • Chen, S. 1983. Large-scale and high-speed multiprocessor system for scientific applications—Cray-X-MP-2 series. Proc. of NATO Advanced Research Workshop on High Speed Computing, Kowalik (ed.), pp.59–67.

  • Coffman, E. G. Jr., ed. 1976. Computer and Job-Shop Scheduling Theory. Wiley, New York, 1976.

    Google Scholar 

  • Coffman. E. G., and Graham R. L. 1972. Optimal scheduling on two processor systems. Acta Informatica, 1(3).

  • Cray, 1985. Multitasking user guide. Cray Computer Systems Technical Note, SN-0222, January 1985.

  • Cytron, R. G. 1986. Doacross: Beyond vectorization for multiprocessors (extended abstract). In Proceedings of the 1986 International Conference on Parallel Processing, St. Charles, Ill., pp.836–844.

  • Garey, M. R., and Johnson, D. S. 1979. Computers and Intractability: A Guide to the Theory of NP-Completeness. Freeman, San Francisco.

    Google Scholar 

  • Gajski, D., Padua, D., Kuck, D., and Kuhn, R. 1982. A second opinion on dataflow machines and languages, IEEE Computer, Feb.

  • Girkar, M., and Polychronopoulos, C. 1988. Compiler issues for supercomputers. Technical Report CSRD No. 676, Center for Supercomputing Research and Development, University of Illinois.

  • Gottlieb, A., Grishman, R., Kruskal, C. P., McAuliffe, K. P., Rudolph, L., and Snir, M. 1983. The NYU Ultracomputer—Designing an MIMD shared-memory parallel machine. IEEE Trans. Computers, C-32,(2).

  • Graham, R. L. 1972. Bounds on multiprocessor scheduling anomalies and related packing algorithms. Proceedings of Spring Joint Computer Conference.

  • Guzzi, M., Padua, D., and Lawrie, D. H. 1988. Cedar Fortran. Internal Document, Center for Supercomputing Research and Development, University of Illinois.

  • Kennedy, K. 1980. Automatic vectorization of Fortran programs to vector form. Technical Report, Rice University, Houston, Texas.

    Google Scholar 

  • Kuck, D. J., Kuhn, R. H., Leasure, B., and Wolfe, M. 1980. The structure of an advanced vectorizer for pipelined processors. Fourth International Computer Software and Applications Conference.

  • Kuck, D. J., Davidson, E. S., Lawrie, D. H., and Sameh, A. H. 1986. Parallel supercomputing today and the Cedar approach. Science231 (4740): 967–974.

    Google Scholar 

  • Kuck, D. J., Kuhn, R., Padua, D., Leasure, B., and Wolfe, M. 1981. Dependence graphs and compiler optimizations. In Proceedings of the 8th ACM Symposium on Principles of Programming Languages, pp. 207–218.

  • Mehrotra, P., and Van Rosendale, J. 1985. The Blaze language: A parallel language for scientific programming. Rep. 85–29, Institute for Computer Applications in Science and Engineering, NASA Langley Research Center, Hampton, Va.

    Google Scholar 

  • Midkiff S. P., and Padua, D. A. 1987. Compiler algorithms for synchronization. IEEE Transactions on Computers, 36, 12, December.

    Google Scholar 

  • Miura, K., and Uchida, K. 1984. Facom vector processor VP-100/VP-200. High Speed Computation, NATO ASI Series, Vol. F7 (J. S. Kowalik, ed.), Springer-Verlag, New York.

    Google Scholar 

  • Nicolau, A. 1984. Parallelism, memory anti-aliasing and correctness for trace scheduling compilers. Ph.D. Thesis, Yale University, New Haven, Conn.

    Google Scholar 

  • Padua, D; A. and Wolfe, M. 1986. Advanced compiler optimizations for supercomputers. Communications of the ACM, 29(12): 1184–1201.

    Google Scholar 

  • Padua, D. A., Haiek, . ., Kuck, D. J., Lawrie, D. H., 1980. High-speed multiprocessors and compilation techniques. IEEE Transactions on Computers, C-29(9).

  • Polychronopoulos, C. D. 1986. On program restructuring, scheduling, and communication for parallel processor systems. Ph.D. Thesis, CSRD No. 595, Center for Supercomputing Research and Development, University of Illinois.

  • Polychronopoulos, C. D. 1987. Loop coalescing: A compiler transformation for parallel machines. Proceedings of the 1987 International Conference on Parallel Processing (St. Charles, Ill., Aug, 1987).

  • Polychronopoulos, C. D. 1988a. Compiler optimizations for enhancing parallelism and their impact on architecture design. IEEE Transactions on Computers, C-37(8).

  • Polychronopoulos, C. D. 1988b. The impact of run-time overhead on usable parallelism. To appear in the Proceedings of the 1988 International Conference on Parallel Processing (St Charles, Ill. August 15–19, 1988).

  • Polychronopoulos, C. D., and Banerjee, U. 1987. Processor allocation for horizontal and vertical parallelism and related speedup bounds. IEEE Transactions on Computers, C-36(4).

  • Polychronopoulos, C. D., and Kuck, D. J. 1987. Guided self-scheduling: A practical scheduling scheme for parallel supercomputers. IEEE Transactions on Computers, C-36(12).

  • Sahni, S. 1984. Scheduling multipipeline and multiprocessor computers. IEEE Trans. Comput., C-33(7).

  • Schwan, K., and Gaimon, C. 1985. Automatic resource allocation for the Cm* multiprocessor. In Proc. of the 1985 International Conference on Distributed Computing Systems.

  • Smith, B. 1981. Architecture and applications of the HEP multiprocessor computer system. Real Time Processing IV, Proc. of SPIE, pp.241–248.

  • Stone, H. S. 1977. Multiprocessor scheduling with the aid of network flow algorithms. IEEE Trans. Software Engineering, SE-3(1).

    Google Scholar 

  • Tang, P., and Yew, P. C. 1986. Processor self-scheduling for multiple-nested parallel loops. Proceedings of the 1986 International Conference on Parallel Processing.

  • Veidenbaum, A. V. 1985. Compiler optimizations and architecture design issues for multiprocessors. Ph.D. Thesis, UIUC-DCS-85-8012, CSRD, University of Illinois.

  • Wolfe, M. J. 1982. Optimizing supercompilers for supercomputers. Ph.D. Thesis, University of Illinois at Urbana-Champaign, DCS Report No. UIUCCDCS-R-82–1105.

    Google Scholar 

  • Zhu, C. Q., and Yew, P. C. 1984. A synchronization scheme and its applications for large multiprocessor systems. In Proc. of the 1984 International Conference on Distributed Computing Systems, pp.486–493.

Download references

Author information

Authors and Affiliations

Authors

Additional information

This work was supported in part by the National Science Foundation under Grant NSF MIP-8410110, the U.S. Department of Energy under Grant DE-FG02-85ER25001, an IBM donation and a grant from AT&T.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Polychronopoulos, C.D. Toward auto-scheduling compilers. J Supercomput 2, 297–330 (1988). https://doi.org/10.1007/BF00129782

Download citation

  • Issue Date:

  • DOI: https://doi.org/10.1007/BF00129782

Keywords

Navigation