DOI: 10.1145/155090.155117

Balanced scheduling: instruction scheduling when memory latency is uncertain

Published: 01 June 1993

ABSTRACT

Traditional list schedulers order instructions based on an optimistic estimate of the load delay imposed by the implementation. They therefore cannot respond to variations in load latency (due to cache hits or misses, congestion in the memory interconnect, etc.) and cannot easily be applied across different implementations. We have developed an alternative algorithm, known as balanced scheduling, that schedules instructions based on an estimate of the amount of instruction-level parallelism in the program. Because scheduling decisions are program-based rather than machine-based, balanced scheduling is unaffected by implementation changes. Because it is based on the amount of instruction-level parallelism that a program can support, it can respond better to variations in load latency. We compared balanced scheduling with a traditional list scheduler on a Fortran workload, simulating several machine types (cache-based workstations, large parallel machines with a multipath interconnect, and a combination of the two, all with non-blocking processors). Performance improvements averaged between 3% and 18%.
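
The idea of weighting loads by the parallelism a program offers can be illustrated with a simplified sketch. It is not the paper's algorithm, whose details the abstract does not give. It assumes a dependence DAG given as an adjacency map, and the function names (`reachable`, `load_weights`) and the equal-sharing rule are hypothetical choices made for illustration: every instruction that is independent of a load could be issued while that load is outstanding, so it contributes one issue slot, split equally among all the loads it is independent of.

```python
from fractions import Fraction


def reachable(dag, start):
    """Return every node reachable from start by following dependence edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for succ in dag.get(node, ()):
            if succ not in seen:
                seen.add(succ)
                stack.append(succ)
    return seen


def load_weights(dag, loads):
    """Estimate, per load, how many issue slots the program can offer to hide it.

    Assumption (illustrative only): each instruction shares one slot equally
    among the loads it has no dependence path to or from.
    """
    nodes = set(dag) | {s for succs in dag.values() for s in succs}
    desc = {n: reachable(dag, n) for n in nodes}
    # Every load starts with weight 1 for its own issue slot.
    weights = {ld: Fraction(1) for ld in loads}
    for instr in nodes:
        # Loads that are neither ancestors nor descendants of instr.
        overlap = [
            ld for ld in loads
            if ld != instr and ld not in desc[instr] and instr not in desc[ld]
        ]
        for ld in overlap:
            weights[ld] += Fraction(1, len(overlap))
    return weights


# ld1 feeds the chain a -> b -> c; ld2 feeds only c; x is independent of all.
example = {"ld1": ["a"], "a": ["b"], "b": ["c"], "ld2": ["c"], "x": []}
print(load_weights(example, ["ld1", "ld2"]))
```

In this example `ld2` receives the larger weight (9/2 against 5/2) because `a` and `b` depend on `ld1` but not on `ld2`, so they are available to fill the slots after `ld2`. A scheduler following this idea would place more independent instructions behind `ld2` than behind `ld1`, regardless of any particular machine's load latency.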

Published in

PLDI '93: Proceedings of the ACM SIGPLAN 1993 conference on Programming language design and implementation
August 1993
313 pages
ISBN: 0897915984
DOI: 10.1145/155090

Copyright © 1993 ACM


Publisher: Association for Computing Machinery, New York, NY, United States
