Abstract
Demand for instruction level parallelism calls for increasing register bandwidth without increasing the number of register ports. Emerging architectures address this need by partitioning registers into multiple distributed banks, which offers a technology scalable substrate but a challenging compilation target. This paper introduces a register allocator for spatially partitioned architectures. The allocator performs bank assignment together with allocation. It minimizes spill code and optimizes bank selection based on a priority function. This algorithm is unique because it must reason about multiple competing resource constraints and dependencies exposed by these architectures. We demonstrate an algorithm that uses critical path estimation, delays from registers to consuming functional units, and hardware resource constraints. We evaluate the algorithm on TRIPS, a functional, partitioned, tiled processor with register banks distributed on top of a 4 ×4 grid of ALUs. These results show that the priority banking algorithm implements a number of policies that improve performance, performance is sensitive to bank assignment, and the compiler manages this resource well.
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Bernstein, D., Golumbic, M., Mansour, Y., Pinter, R., Goldin, D., Nahshon, I., Krawczyk, H.: Spill code minimization techniques for optimizing compliers. In: ACM SIGPLAN Symposium on Interpreters and Interpretive Techniques, pp. 258–263 (1989)
Brasier, T.S., Sweany, P.H., Beaty, S.J., Carr, S.: CRAIG: a practical framework for combining instruction scheduling and register assignment. In: Parallel Architectures and Compilation Techniques, pp. 11–18 (1995)
Briggs, P., Cooper, K.D., Torczon, L.: Improvements to graph coloring register allocation. In: ACM Transactions on Programming Languages and Systems, May 1994, vol. 16(3), pp. 428–455 (1994)
Chaitin, G.: Register allocation and spilling via graph coloring. In: ACM SIGPLAN Symposium on Compiler Construction, pp. 98–105 (1982)
Chow, F.C., Hennessy, J.L.: Priority-based coloring approach to register allocation. In: ACM Transactions on Programming Languages and Systems, vol. 12, pp. 501–536 (1990)
Coons, K., Chen, X., Kushwaha, S., Burger, D., McKinley, K.S.: A spatial path scheduling algorithm for edge architectures. In: ACM Conference on Architecture Support for Programming Languages and Operating Systems, pp. 129–140 (2006)
EEMBC. Embedded microprocessor benchmark consortium, http://www.eembc.org/
Ellis, J.: A Compiler for VLIW Architecture. PhD thesis, Yale University (1984)
Farkas, K.L., Chow, P., Jouppi, N.P., Vranesic, Z.: The multicluster architecture: Reducing processor cycle time through partitioning. In: ACM/IEEE Symposium on Microarchitecture, pp. 327–356 (1997)
Hiser, J., Carr, S., Sweany, P., Beaty, S.J.: Register assignment for software pipelining with partitioned register banks. In: International Parallel and Distributed Processing Symposium, pp. 211–217 (2000)
Janssen, J., Corporaal, H.: Partitioned register files for TTAs. In: ACM/IEEE Symposium on Micorarchitecture, December 1995, pp. 301–312 (1995)
Kailas, K., Ebcioglu, K., Agrawala, A.: Cars: A new code generation framework for clustered ILP processors. In: Conference on High Performance Computer Architecture, pp. 133–143 (2001)
Maher, B., Smith, A., Burger, D., McKinley, K.S.: Merging head and tail duplication for convergent hyperblock formation. In: ACM/IEEE International Symposium on Microarchitecture, pp. 65–76 (2006)
Poletto, M., Sarkar, V.: Linear scan register allocation. In: ACM Transactions on Programming Languages and Systems, Spetember 1999, vol. 21, pp. 895–913 (1999)
Smith, A., Burrill, J., Gibson, J., Maher, B., Nethercote, N., Yoder, B., Burger, D.C., McKinley, K.S.: Compiling for EDGE architectures. In: International Conference on Code Generation and Optimization, pp. 185–195 (2006)
SPEC2000CPU. The standard performance evaluation corporation (SPEC), http://www.spec.org/
Taylor, M.B., Agarwal, A.: Evaluation of the raw microprocessor: An exposed-wire-delay architecture for ILP and streams. In: ACM SIGARCH International Symposium on Computer Architecture, pp. 2–13 (2004)
Traub, O., Holloway, G., Smith, M.D.: Quality and speed in linear-scan register allocation. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, June 1998, pp. 895–913 (1998)
Warter, N.J.: Reverse if-conversion. In: ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 290–299 (1993)
Yoder, B., Burrill, J., McDonald, R., Bush, K.B., Coons, K., Gebhart, M., Govindan, S., Maher, B., Nagarajan, R., Robatmili, B., Sankaralingam, K., Sharif, S., Smith, A.: Software infrastructure and tools for the TRIPS prototype. In: Third Annual Workshop on Modeling, Benchmarking and Simulation (2007)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Robatmili, B., Coons, K., Burger, D., McKinley, K.S. (2008). Register Bank Assignment for Spatially Partitioned Processors. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_5
Download citation
DOI: https://doi.org/10.1007/978-3-540-89740-8_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89739-2
Online ISBN: 978-3-540-89740-8
eBook Packages: Computer ScienceComputer Science (R0)