Abstract
Inline expansion and interprocedural register allocation are two general approaches to interprocedural optimization. However, certain situations prevent either of them from being applied smoothly to procedure calls. In particular, interactions between inlining and register allocation can cause an inlined version of a program to run more slowly than its noninlined counterpart. This paper describes a method of integrating inlining and interprocedural register allocation to reduce procedure call overhead without this negative effect. We use profile information to identify heavily called procedure regions, and the register usage information of each code site to optimize the placement of the register save/restore code. The method also takes full advantage of free-use registers at each procedure call site. The average performance improvement is 1.21 relative to previous schemes that apply only one of the two techniques.
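The role of register usage and profile information described above can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not the algorithm of the paper: the function names, the dictionary layout, the register names, and the call counts are all hypothetical. It assumes that a register needs a save/restore pair at a call site only if it is live across the call and the callee may overwrite it, and that each pair is weighted by the profiled execution count of that call site.

```python
# Illustrative sketch only: names, data layout, and cost model are
# assumptions made for this example, not taken from the paper.

def registers_to_save(live_across_call, clobbered_by_callee):
    """Registers that need a save/restore pair around one call site.

    A register that is live in the caller but never written by the
    callee (or written by the callee but dead in the caller) is free
    to use and needs no save/restore code.
    """
    return live_across_call & clobbered_by_callee


def save_restore_cost(call_sites):
    """Profile-weighted number of save/restore pairs executed."""
    return sum(
        site["count"] * len(registers_to_save(site["live"], site["clobbered"]))
        for site in call_sites
    )


# Hypothetical profile: one hot call site and one cold call site.
call_sites = [
    {"count": 1000, "live": {"r1", "r2"}, "clobbered": {"r2", "r3"}},
    {"count": 10, "live": {"r1", "r4"}, "clobbered": {"r1", "r4", "r5"}},
]

print(save_restore_cost(call_sites))  # 1000 * 1 + 10 * 2 = 1020
```

In this toy model the hot call site dominates the total, which is why profile counts are a natural guide for deciding where save/restore code is cheapest to place.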
About this article
Cite this article
Lai, F., Chao, Yk. & Hsieh, CJ. The complementary relationship of interprocedural register allocation and inlining. Int J Parallel Prog 22, 409–434 (1994). https://doi.org/10.1007/BF02577739
DOI: https://doi.org/10.1007/BF02577739