Abstract
It is relatively clear how to map regular, repetitive, or grid-oriented computations onto SIMD architectures. It is less clear how to do this for irregular computations, even though branch-free code may contain a significant amount of intrinsic parallelism. We study compilation techniques for this type of code when targeted to SIMD computers and illustrate their use on a simple model architecture.
In this paper, we present one of the compilation techniques we have developed for SIMD computers, global register allocation, and demonstrate that it can effectively allocate registers when parallelizing irregular computations in branch-free code. The technique extends and modifies the register-allocation-via-graph-coloring approach used by sequential compilers. Our performance results validate the method.
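For readers unfamiliar with the sequential baseline mentioned above, the following is a minimal sketch of Chaitin-style register allocation via graph coloring. It does not reproduce the SIMD extension described in the paper; it only illustrates the general simplify-and-select idea under simple assumptions. The function names (`build_interference`, `color_registers`), the half-open live-range representation, and the spill heuristic (highest remaining degree) are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple


def build_interference(live_ranges: Dict[str, Tuple[int, int]]) -> Dict[str, Set[str]]:
    """Build an interference graph: two values interfere when their
    half-open live ranges [start, end) overlap (illustrative assumption)."""
    graph: Dict[str, Set[str]] = {v: set() for v in live_ranges}
    names = list(live_ranges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            a_start, a_end = live_ranges[a]
            b_start, b_end = live_ranges[b]
            if a_start < b_end and b_start < a_end:
                graph[a].add(b)
                graph[b].add(a)
    return graph


def color_registers(graph: Dict[str, Set[str]], k: int) -> Tuple[Dict[str, int], List[str]]:
    """Simplify/select coloring with k registers.
    Returns (register assignment, list of spilled values)."""
    work = {v: set(n) for v, n in graph.items()}  # mutable copy of the graph
    stack: List[str] = []
    spilled: List[str] = []
    while work:
        # Simplify: a node with fewer than k neighbours is always colorable.
        candidates = [v for v in work if len(work[v]) < k]
        if candidates:
            node = min(candidates)  # deterministic tie-break by name
            stack.append(node)
        else:
            # No trivially colorable node: spill the highest-degree one.
            node = max(work, key=lambda v: (len(work[v]), v))
            spilled.append(node)
        for neighbour in work[node]:
            work[neighbour].discard(node)
        del work[node]
    # Select: pop nodes and give each the lowest register its neighbours do not use.
    assignment: Dict[str, int] = {}
    while stack:
        node = stack.pop()
        used = {assignment[n] for n in graph[node] if n in assignment}
        assignment[node] = next(r for r in range(k) if r not in used)
    return assignment, spilled


if __name__ == "__main__":
    ranges = {"a": (0, 3), "b": (1, 4), "c": (2, 5), "d": (4, 6)}
    g = build_interference(ranges)
    print(color_registers(g, 3))
    print(color_registers(g, 2))
```

In this sketch, values `a`, `b`, and `c` are simultaneously live, so three registers suffice without spilling, while two registers force one of them to be spilled to memory.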
Author information
Additional information
Research supported in part by the Advanced Research Projects Agency of the Department of Defense under ONR Contract N00014-92-J-1989, by ONR Contract N0014-92-J-1839, by United States-Israel Binational Science Foundation Grant 92-00234, and in part by the U.S. Army Research Office through the Mathematical Science Institute of Cornell University.
Benjamin HAO received his Ph.D. degree from the Computer Science Department of Cornell University. He received his B.S. degree in electrical engineering and computer science from the University of California at Berkeley. Mr. Hao worked as a technical staff member in Sun Microsystems' advanced development group from 1988 to 1991. His research interests include parallel computing, distributed computing, computer hardware design, and multimedia.
David PEARSON was born in Medina, NY on December 3, 1954. He received his A.B. degree from Dartmouth College in 1975 and is currently pursuing a Ph.D. degree in computer science at Cornell University. Mr. Pearson worked as a system programmer for Data General, was a network designer for Dartmouth, and helped found True Basic, Inc. where he served as the Vice-President of R&D from 1983 to 1988. His research interests include parallel computing and the theory of algorithms.
Richard ZIPPEL received his Ph.D. from MIT for research in symbolic computation and randomized algorithms. During this period he was one of the main authors of the symbolic computing system Macsyma. After joining the faculty at MIT he led a group doing research in VLSI design, VLSI CAD, and computer architecture. Among the fruits of this research were the database accelerator architecture and the first university-level course in memory design. He then joined Symbolics, Inc. as a Technical Director and led their parallel computing effort. Since joining the Computer Science Department at Cornell University, he has been doing research in programming languages, symbolic computation, and collaborative engineering.
Cite this article
Hao, B., Pearson, D. & Zippel, R. Global register allocation for SIMD multiprocessors. J. of Comput. Sci. & Technol. 11, 222–236 (1996). https://doi.org/10.1007/BF02943131
DOI: https://doi.org/10.1007/BF02943131